THE GEORGE WASHINGTON UNIVERSITY
Graduate  School  of  Arts  and  Sciences 

Econometric  Research  on  Navy  Manpower  Problems 

THE  ANALYSIS  OF  CONTINGENCY  TABLES  - 
A  METHODOLOGICAL  EXPOSITION* 

by 

S.  Kullback 

1.  Introduction 

The  primary  purpose  of  this  report  is  to  present  an  exposition  of 
the  methodology  underlying  the  analysis  of  the  information  in  contingency 
tables.  We  shall  stress  the  concepts,  techniques,  analyses  and  inferences 
without  entering  into  extensive  technical  statistical  proofs  or  detailed 
references  to  the  bibliography  at  the  end. 

It is useful to note that we are concerned with an aspect of
multivariate  (multiple  variates)  analysis  with  particular  application  to 
qualitative  or  categorical  as  well  as  quantitative  variables.  The  basic 
data  we  deal  with  are  counts  in  multiway  cross-classifications  or  multiple 
contingency  tables.  Multiway  contingency  tables,  or  cross-classifications 
of  vectors  of  discrete  random  variables  provide  a  useful  approach  to  the 
analysis  of  multivariate  discrete  data. 

As we shall see, the analytic procedures serve to bring out various
interrelationships among the classificatory variables in a multiway
cross-classification or contingency table in many dimensions. Classical
problems in the historical development of the analysis of contingency
tables concerned themselves with such questions as the independence or
conditional independence of the classificatory variables, or homogeneity
or conditional homogeneity of the classificatory variables over time or
space, for example. Such classical problems turn out to be special cases
of the techniques we shall discuss. These techniques result in analyses
which are essentially regression-type analyses. As such they enable us
to determine the relationship of one or more "dependent" qualitative or
categorical variables of interest to a set of "independent" classificatory
variables, as well as the relative effects of changes in the "independent"
variables on the "dependent" variables. In particular, such problems as
determining the possible factors affecting failure or success in boot
camp or decisions as to reenlistment, and measuring their effect,
lend themselves to the analysis we shall examine.

The  methodology  is  based  on  the  Principle  of  Minimum  Discrimination 
Information  Estimation,  associated  statistics  and  Analyses  of  Information. 
General  computer  programs  are  available  to  provide  the  data  for  the 
inferences. 

2.  Contingency  Tables 

We  shall  first  present  some  examples  of  contingency  tables  to  help 
clarify  some  of  the  terminology  and,  so  to  speak,  set  the  scene.  We  shall 
use  values  obtained  from  the  Marine  COHORT  File  of  1966. 

The  simplest  example  of  a  contingency  table  is  a  one-way  table  with 
one  classification,  and  several  categories.  The  distribution  of  recruits 
by  home  of  record  is  such  an  example,  with  four  categories. 

TABLE 2.1

HOME OF RECORD

East     North    West     South    Total
4201     4552     2840     5130     16723

There are not very many interesting questions that may arise for
Table 2.1. The most likely question would be whether the distribution of
the occurrences is consistent with the distribution of potential recruits
in the U.S. population by corresponding geographical classification.

A  two-way  contingency  table  arises  when  each  observation  has  two 
classifications  with  different  possible  numbers  of  categories  for  each 
classification.  An  example  of  a  2x2  two-way  contingency  table  arises  when 
we  distribute  the  recruits  by  Race  and  Success  in  Boot  Camp. 


TABLE 2.2a

Success in Boot Camp

               Fail      Pass     Total
White           511     12637     13148
Non-white        73      1629      1702
Total           584     14266     14850

We index the row categories by i, with i = 1 for White and i = 2 for
Non-white, and the column categories by j, with j = 1 for Fail and j = 2
for Pass, and denote the occurrences by x(ij), that is, the notation

Variable                 Index    1        2
Race                     i        White    Non-white
Boot Camp Completion     j        Fail     Pass

Thus  Table  2.2a  is  represented  as  in  Table  2.2b. 


TABLE 2.2b

Success in Boot Camp

                   Fail, j=1    Pass, j=2
White, i=1         x(11)        x(12)        x(1·)
Non-white, i=2     x(21)        x(22)        x(2·)
                   x(·1)        x(·2)        x(··) = n

The sum of the entries across a row provides the corresponding row
marginal and the sum of the entries down a column provides the
corresponding column marginal. In the notation a dot is used to indicate
summation over a particular index. For Tables 2.2a and 2.2b the related
values are

$x(11) = 511$
$x(12) = 12637$
$x(21) = 73$
$x(22) = 1629$
$x(1\cdot) = x(11) + x(12) = 13148$
$x(2\cdot) = x(21) + x(22) = 1702$
$x(\cdot 1) = x(11) + x(21) = 584$
$x(\cdot 2) = x(12) + x(22) = 14266$
$x(\cdot\cdot) = x(11) + x(12) + x(21) + x(22) = 14850$

but we usually use $n = x(\cdot\cdot)$.

For two-way 2x2 tables the primary question of interest is whether
the row and column variables are independent. Thus in the two-way Table
2.2a the interest is in whether success in boot camp is the same for the
two race categories. To answer this question one estimates the cell
entries under the hypothesis of independence as a product of the
marginals, that is, denoting the estimate by $x^*(ij)$ one uses
$x^*(ij) = x(i\cdot)\,x(\cdot j)/n$. Some appropriate measure of the
deviation between $x(ij)$ and $x^*(ij)$ is then used to determine whether
the differences are "larger" than one would reasonably expect under the
hypothesis of independence.

The  estimated  two-way  table  under  the  hypothesis  or  model  of 
independence  is  given  in  Table  2.2c. 

TABLE 2.2c

ESTIMATE UNDER INDEPENDENCE
x*(ij)

          j = 1            j = 2
i = 1     x(1·)x(·1)/n     x(1·)x(·2)/n     x(1·)
i = 2     x(2·)x(·1)/n     x(2·)x(·2)/n     x(2·)
          x(·1)            x(·2)            n

Note that the estimated table has the same marginals as the observed
table x(ij).
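The estimate under independence can be checked numerically. The following is a minimal sketch in Python, a language the report itself does not use; the function name `independence_estimate` is our own illustrative choice. It forms the product of marginals divided by n for the counts of Table 2.2a and confirms that the estimated table reproduces the observed row and column marginals.

```python
def independence_estimate(table):
    """Return x*(ij) = x(i.) x(.j) / n for a two-way table of counts."""
    row_totals = [sum(row) for row in table]          # x(i.)
    col_totals = [sum(col) for col in zip(*table)]    # x(.j)
    n = sum(row_totals)                               # x(..)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Counts of Table 2.2a: rows White, Non-white; columns Fail, Pass.
observed = [[511, 12637], [73, 1629]]
estimate = independence_estimate(observed)

# Marginals of the estimated table.
row_marginals = [sum(row) for row in estimate]
col_marginals = [sum(col) for col in zip(*estimate)]

print(estimate)
print(row_marginals, col_marginals)
```

The same function applies unchanged to a table with any number of rows and columns, since it only uses the row and column totals.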


A common statistical measure of the association or interaction
between the variables of a two-way 2x2 contingency table is the
cross-product ratio, or its logarithm. The cross-product ratio is
defined by

(2.1)  $\frac{x(11)\,x(22)}{x(12)\,x(21)}$ ,

though we shall be more concerned with its logarithm

(2.2)  $\log \frac{x(11)\,x(22)}{x(12)\,x(21)}$ .

We shall use natural logarithms, that is, logarithms to the base e,
rather than common logarithms to the base 10, because of the nature of the
underlying mathematical statistical theory. Note that with the estimate
for independence, or no association, the logarithm of the cross-product
ratio is zero.

(2.3)  $\log \frac{x^*(11)\,x^*(22)}{x^*(12)\,x^*(21)} = \log \frac{\frac{x(1\cdot)x(\cdot 1)}{n}\,\frac{x(2\cdot)x(\cdot 2)}{n}}{\frac{x(1\cdot)x(\cdot 2)}{n}\,\frac{x(2\cdot)x(\cdot 1)}{n}} = \log 1 = 0$ .


The logarithm of the cross-product ratio is positive if the odds satisfy
the inequalities

$\frac{x(11)}{x(21)} > \frac{x(12)}{x(22)}$   or   $\frac{x(11)}{x(12)} > \frac{x(21)}{x(22)}$ ,

since then we get for the log-odds

$\log \frac{x(11)\,x(22)}{x(12)\,x(21)} = \log \frac{x(11)}{x(21)} - \log \frac{x(12)}{x(22)} > 0$
$\qquad = \log \frac{x(11)}{x(12)} - \log \frac{x(21)}{x(22)} > 0$ .

The logarithm of the cross-product ratio is negative if the odds satisfy
the inequalities

$\frac{x(11)}{x(21)} < \frac{x(12)}{x(22)}$   or   $\frac{x(11)}{x(12)} < \frac{x(21)}{x(22)}$ ,

since then we get for the log-odds

$\log \frac{x(11)\,x(22)}{x(12)\,x(21)} = \log \frac{x(11)}{x(21)} - \log \frac{x(12)}{x(22)} < 0$
$\qquad = \log \frac{x(11)}{x(12)} - \log \frac{x(21)}{x(22)} < 0$ .


The logarithm of the cross-product ratio thus varies from $-\infty$ to
$+\infty$. Later we shall consider procedures for assessing the
significance of the deviation of the logarithm of the cross-product ratio
from zero, the value corresponding to no association or no interaction.
Thus for the two-way Table 2.2a we have

$\log \frac{511 \times 1629}{73 \times 12637} = \log \frac{832419}{922501} = \log 0.902 = -0.1031$ .

We note that the odds of failure for White are 511/12637 = 0.04044 and
the odds of failure for Non-white are 73/1629 = 0.04481.
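The arithmetic for Table 2.2a can be reproduced with a short sketch. The Python below is illustrative only, and the function name `log_cross_product_ratio` is our own. It computes the logarithm of the cross-product ratio both directly and as a difference of log-odds. Note that the value -0.1031 quoted above is the logarithm of the ratio after rounding to 0.902; carried at full precision the result is about -0.1028.

```python
import math


def log_cross_product_ratio(table):
    """Natural log of x(11)x(22) / (x(12)x(21)) for a 2x2 table of counts."""
    (x11, x12), (x21, x22) = table
    return math.log((x11 * x22) / (x12 * x21))


observed = [[511, 12637], [73, 1629]]   # Table 2.2a

log_cpr = log_cross_product_ratio(observed)

# Odds of failure within each race category.
odds_white = 511 / 12637
odds_nonwhite = 73 / 1629

# The same quantity written as a difference of log-odds.
log_odds_difference = math.log(odds_white) - math.log(odds_nonwhite)

print(round(odds_white, 5), round(odds_nonwhite, 5))
print(log_cpr, log_odds_difference)
```

The negative sign reflects the fact that the odds of failure for White are smaller than the odds of failure for Non-white.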

Similar  procedures  apply  to  the  case  of  a  two-way  rxc  contingency 
table,  that  is,  one  with  r  rows  and  c  columns. 


TABLE 2.3a

TWO-WAY rxc CONTINGENCY TABLE

        1        2        ...    c
1       x(11)    x(12)    ...    x(1c)    x(1·)
2       x(21)    x(22)    ...    x(2c)    x(2·)
...     ...      ...      ...    ...      ...
r       x(r1)    x(r2)    ...    x(rc)    x(r·)
        x(·1)    x(·2)    ...    x(·c)    n

Under a hypothesis or model of independence of row and column categories,
$x^*(ij) = x(i\cdot)\,x(\cdot j)/n$. Even if the row categories, say, are
not randomly observed but selected with respect to some characteristic,
say time or space, the mathematical procedures are still the same for
determining whether the column categories are homogeneous over the row
categories, time or space for instance. In the latter case we may consider
the two-way table as a set of one-way tables. Terms which cover both the
case of independence and homogeneity are "association" or "interaction,"
that is, we question whether there is association or interaction among
the variables.

The estimated two-way rxc contingency table under the hypothesis
or model of independence is given in Table 2.3b.

TABLE 2.3b

ESTIMATE UNDER INDEPENDENCE
x*(ij)

        1               2               ...    c
1       x(1·)x(·1)/n    x(1·)x(·2)/n    ...    x(1·)x(·c)/n    x(1·)
2       x(2·)x(·1)/n    x(2·)x(·2)/n    ...    x(2·)x(·c)/n    x(2·)
...     ...             ...             ...    ...             ...
r       x(r·)x(·1)/n    x(r·)x(·2)/n    ...    x(r·)x(·c)/n    x(r·)
        x(·1)           x(·2)           ...    x(·c)           n

Note  that  the  estimated  table  has  the  same  marginals  as  the  observed 
Table  2.3a. 

A  three-way  contingency  table  arises  when  each  observation  has  three 
classifications  with  different  possible  numbers  of  categories  for  each 
classification.  The  simplest  three-way  contingency  table  is  2x2x2,  that 
is,  with  two  categories  for  each  classification.  An  example  of  a  three- 
way  2x2x2  contingency  table  is  the  following  cross-classification  of 
recruits  by  AFQT  (I  and  II,  III  and  IV),  Race  (White,  Non-white),  Success 
in  Boot  Camp  (Fail,  Pass). 


TR-1116 


$x(\cdot 11) = x(111) + x(211)$
$x(\cdot 12) = x(112) + x(212)$
$x(\cdot 21) = x(121) + x(221)$
$x(\cdot 22) = x(122) + x(222)$ .

The one-way marginals are

$x(1\cdot\cdot) = x(111) + x(112) + x(121) + x(122) = x(11\cdot) + x(12\cdot)$
$x(2\cdot\cdot) = x(211) + x(212) + x(221) + x(222) = x(21\cdot) + x(22\cdot)$
$x(\cdot 1\cdot) = x(111) + x(112) + x(211) + x(212) = x(11\cdot) + x(21\cdot)$
$x(\cdot 2\cdot) = x(121) + x(122) + x(221) + x(222) = x(12\cdot) + x(22\cdot)$
$x(\cdot\cdot 1) = x(111) + x(121) + x(211) + x(221) = x(1\cdot 1) + x(2\cdot 1)$
$x(\cdot\cdot 2) = x(112) + x(122) + x(212) + x(222) = x(1\cdot 2) + x(2\cdot 2)$

The entries x(ijk) in Table 2.4b may also be considered as three-way
marginals.

With more variables there are more possible questions of interest.
One may be interested in whether any pair of the variables are independent
or show no interaction or association. One may be interested in
conditional independence, that is, whether a pair of variables are
independent given the third variable. One may be interested in whether the
three variables are mutually independent or whether one of the variables
is independent of the pair of the other variables. These questions of
independence, no interaction or association are all answered by
considering estimates which are explicitly represented in terms of
products of various marginals. We list some of these estimates.

Mutual independence of i, j, and k:            $x^*(ijk) = x(i\cdot\cdot)\,x(\cdot j\cdot)\,x(\cdot\cdot k)/n^2$
Independence of i and (jk) jointly:            $x^*_a(ijk) = x(i\cdot\cdot)\,x(\cdot jk)/n$
Conditional independence of i and j given k:   $x^*_b(ijk) = x(i\cdot k)\,x(\cdot jk)/x(\cdot\cdot k)$

As might be expected, these estimates also apply in the general three-way
rxsxt contingency table.

We note that the estimate under mutual independence of i, j,
and k has the same one-way marginals as the observed table x(ijk).

$x^*(111) = x(1\cdot\cdot)\,x(\cdot 1\cdot)\,x(\cdot\cdot 1)/n^2$
$x^*(112) = x(1\cdot\cdot)\,x(\cdot 1\cdot)\,x(\cdot\cdot 2)/n^2$
$x^*(121) = x(1\cdot\cdot)\,x(\cdot 2\cdot)\,x(\cdot\cdot 1)/n^2$
$x^*(122) = x(1\cdot\cdot)\,x(\cdot 2\cdot)\,x(\cdot\cdot 2)/n^2$
$x^*(211) = x(2\cdot\cdot)\,x(\cdot 1\cdot)\,x(\cdot\cdot 1)/n^2$
$x^*(212) = x(2\cdot\cdot)\,x(\cdot 1\cdot)\,x(\cdot\cdot 2)/n^2$
$x^*(221) = x(2\cdot\cdot)\,x(\cdot 2\cdot)\,x(\cdot\cdot 1)/n^2$
$x^*(222) = x(2\cdot\cdot)\,x(\cdot 2\cdot)\,x(\cdot\cdot 2)/n^2$

$x^*(1\cdot\cdot) = x^*(111) + x^*(112) + x^*(121) + x^*(122)$
$\qquad = x(1\cdot\cdot)\,x(\cdot 1\cdot)/n + x(1\cdot\cdot)\,x(\cdot 2\cdot)/n$
$\qquad = x(1\cdot\cdot)$

$x^*(2\cdot\cdot) = x^*(211) + x^*(212) + x^*(221) + x^*(222)$
$\qquad = x(2\cdot\cdot)\,x(\cdot 1\cdot)/n + x(2\cdot\cdot)\,x(\cdot 2\cdot)/n$
$\qquad = x(2\cdot\cdot)$

$x^*(\cdot 1\cdot) = x^*(111) + x^*(112) + x^*(211) + x^*(212)$
$\qquad = x(1\cdot\cdot)\,x(\cdot 1\cdot)/n + x(2\cdot\cdot)\,x(\cdot 1\cdot)/n$
$\qquad = x(\cdot 1\cdot)$

$x^*(\cdot 2\cdot) = x^*(121) + x^*(122) + x^*(221) + x^*(222)$
$\qquad = x(\cdot 2\cdot)$

$x^*(\cdot\cdot 1) = x^*(111) + x^*(121) + x^*(211) + x^*(221)$
$\qquad = x(\cdot\cdot 1)$

$x^*(\cdot\cdot 2) = x^*(112) + x^*(122) + x^*(212) + x^*(222)$
$\qquad = x(\cdot\cdot 2)$

However, the two-way marginals of the estimate under mutual independence
of i, j, and k differ from the two-way marginals of the observed
table x(ijk). Thus, for example,




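$x^*(11\cdot) = x^*(111) + x^*(112)$
$\qquad = x(1\cdot\cdot)\,x(\cdot 1\cdot)\,x(\cdot\cdot 1)/n^2 + x(1\cdot\cdot)\,x(\cdot 1\cdot)\,x(\cdot\cdot 2)/n^2$
$\qquad = x(1\cdot\cdot)\,x(\cdot 1\cdot)/n$ ,

and the latter value is not necessarily equal to $x(11\cdot)$.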

The estimate under the hypothesis or model of independence of i
and (jk) jointly has the same one-way marginals and the same two-way
jk-marginal as the observed table x(ijk).

$x^*_a(111) = x(1\cdot\cdot)\,x(\cdot 11)/n$
$x^*_a(112) = x(1\cdot\cdot)\,x(\cdot 12)/n$
$x^*_a(121) = x(1\cdot\cdot)\,x(\cdot 21)/n$
$x^*_a(122) = x(1\cdot\cdot)\,x(\cdot 22)/n$
$x^*_a(211) = x(2\cdot\cdot)\,x(\cdot 11)/n$
$x^*_a(212) = x(2\cdot\cdot)\,x(\cdot 12)/n$
$x^*_a(221) = x(2\cdot\cdot)\,x(\cdot 21)/n$
$x^*_a(222) = x(2\cdot\cdot)\,x(\cdot 22)/n$

$x^*_a(1\cdot\cdot) = x^*_a(111) + x^*_a(112) + x^*_a(121) + x^*_a(122)$
$\qquad = x(1\cdot\cdot)\,x(\cdot 11)/n + x(1\cdot\cdot)\,x(\cdot 12)/n + x(1\cdot\cdot)\,x(\cdot 21)/n + x(1\cdot\cdot)\,x(\cdot 22)/n$
$\qquad = x(1\cdot\cdot)\,[x(\cdot 11) + x(\cdot 12) + x(\cdot 21) + x(\cdot 22)]/n$
$\qquad = x(1\cdot\cdot)$

Similar results follow for the other one-way marginals.

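$x^*_a(\cdot 11) = x^*_a(111) + x^*_a(211)$
$\qquad = x(1\cdot\cdot)\,x(\cdot 11)/n + x(2\cdot\cdot)\,x(\cdot 11)/n$
$\qquad = x(\cdot 11)$

$x^*_a(\cdot 12) = x^*_a(112) + x^*_a(212)$
$\qquad = x(1\cdot\cdot)\,x(\cdot 12)/n + x(2\cdot\cdot)\,x(\cdot 12)/n$
$\qquad = x(\cdot 12)$

$x^*_a(\cdot 21) = x^*_a(121) + x^*_a(221)$
$\qquad = x(1\cdot\cdot)\,x(\cdot 21)/n + x(2\cdot\cdot)\,x(\cdot 21)/n$
$\qquad = x(\cdot 21)$

$x^*_a(\cdot 22) = x^*_a(122) + x^*_a(222)$
$\qquad = x(1\cdot\cdot)\,x(\cdot 22)/n + x(2\cdot\cdot)\,x(\cdot 22)/n$
$\qquad = x(\cdot 22)$

However, for the other two-way marginals, for example,

$x^*_a(11\cdot) = x^*_a(111) + x^*_a(112)$
$\qquad = x(1\cdot\cdot)\,x(\cdot 11)/n + x(1\cdot\cdot)\,x(\cdot 12)/n$
$\qquad = x(1\cdot\cdot)\,[x(\cdot 11) + x(\cdot 12)]/n$
$\qquad = x(1\cdot\cdot)\,x(\cdot 1\cdot)/n$ ,

and the latter value is not necessarily equal to $x(11\cdot)$.

$x^*_a(1\cdot 1) = x^*_a(111) + x^*_a(121)$
$\qquad = x(1\cdot\cdot)\,x(\cdot 11)/n + x(1\cdot\cdot)\,x(\cdot 21)/n$
$\qquad = x(1\cdot\cdot)\,[x(\cdot 11) + x(\cdot 21)]/n$
$\qquad = x(1\cdot\cdot)\,x(\cdot\cdot 1)/n$ ,

and the latter value is not necessarily equal to $x(1\cdot 1)$.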

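The estimate under the hypothesis or model of conditional
independence of i and j given k has the same one-way marginals and the
same two-way ik- and jk-marginals as the observed table x(ijk).

$x^*_b(111) = x(1\cdot 1)\,x(\cdot 11)/x(\cdot\cdot 1)$
$x^*_b(112) = x(1\cdot 2)\,x(\cdot 12)/x(\cdot\cdot 2)$
$x^*_b(121) = x(1\cdot 1)\,x(\cdot 21)/x(\cdot\cdot 1)$
$x^*_b(122) = x(1\cdot 2)\,x(\cdot 22)/x(\cdot\cdot 2)$
$x^*_b(211) = x(2\cdot 1)\,x(\cdot 11)/x(\cdot\cdot 1)$
$x^*_b(212) = x(2\cdot 2)\,x(\cdot 12)/x(\cdot\cdot 2)$
$x^*_b(221) = x(2\cdot 1)\,x(\cdot 21)/x(\cdot\cdot 1)$
$x^*_b(222) = x(2\cdot 2)\,x(\cdot 22)/x(\cdot\cdot 2)$

$x^*_b(1\cdot\cdot) = x^*_b(111) + x^*_b(112) + x^*_b(121) + x^*_b(122)$
$\qquad = x(1\cdot 1)\,x(\cdot 11)/x(\cdot\cdot 1) + x(1\cdot 2)\,x(\cdot 12)/x(\cdot\cdot 2)$
$\qquad\quad + x(1\cdot 1)\,x(\cdot 21)/x(\cdot\cdot 1) + x(1\cdot 2)\,x(\cdot 22)/x(\cdot\cdot 2)$
$\qquad = x(1\cdot 1) + x(1\cdot 2) = x(1\cdot\cdot)$ .

Similar results follow for the other one-way marginals.

$x^*_b(1\cdot 1) = x^*_b(111) + x^*_b(121)$
$\qquad = x(1\cdot 1)\,x(\cdot 11)/x(\cdot\cdot 1) + x(1\cdot 1)\,x(\cdot 21)/x(\cdot\cdot 1)$
$\qquad = x(1\cdot 1)$

$x^*_b(1\cdot 2) = x^*_b(112) + x^*_b(122)$
$\qquad = x(1\cdot 2)\,x(\cdot 12)/x(\cdot\cdot 2) + x(1\cdot 2)\,x(\cdot 22)/x(\cdot\cdot 2)$
$\qquad = x(1\cdot 2)$ ,

and in a similar manner we have

$x^*_b(2\cdot 1) = x(2\cdot 1)$ ,  $x^*_b(2\cdot 2) = x(2\cdot 2)$

$x^*_b(\cdot 11) = x^*_b(111) + x^*_b(211)$
$\qquad = x(1\cdot 1)\,x(\cdot 11)/x(\cdot\cdot 1) + x(2\cdot 1)\,x(\cdot 11)/x(\cdot\cdot 1)$
$\qquad = x(\cdot 11)$

$x^*_b(\cdot 12) = x^*_b(112) + x^*_b(212)$
$\qquad = x(1\cdot 2)\,x(\cdot 12)/x(\cdot\cdot 2) + x(2\cdot 2)\,x(\cdot 12)/x(\cdot\cdot 2)$
$\qquad = x(\cdot 12)$ ,

and in a similar manner we have

$x^*_b(\cdot 21) = x(\cdot 21)$ ,  $x^*_b(\cdot 22) = x(\cdot 22)$ .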
However, for the other two-way marginals

$x^*_b(11\cdot) = x^*_b(111) + x^*_b(112)$
$\qquad = x(1\cdot 1)\,x(\cdot 11)/x(\cdot\cdot 1) + x(1\cdot 2)\,x(\cdot 12)/x(\cdot\cdot 2)$ ,

and the latter value is not necessarily equal to $x(11\cdot)$.

We remark that one of the constraints in the determination of the
estimates was that they have certain marginals the same as the observed
table.
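The marginal properties derived above can be verified numerically. The sketch below is illustrative Python: the eight counts are hypothetical (they are not taken from the Marine COHORT file), and the helper names `marginal` and `same` are our own. It builds the three product-form estimates for a 2x2x2 table and reports which one-way and two-way marginals each estimate reproduces.

```python
from itertools import product

cells = list(product((1, 2), repeat=3))                      # all (i, j, k)
hypothetical_counts = [120, 340, 45, 160, 80, 210, 30, 95]   # illustrative only
x = dict(zip(cells, hypothetical_counts))
n = sum(x.values())


def marginal(table, keep):
    """Sum `table` over every index not in `keep` (0 = i, 1 = j, 2 = k)."""
    out = {}
    for cell, value in table.items():
        key = tuple(cell[axis] for axis in keep)
        out[key] = out.get(key, 0.0) + value
    return out


def same(a, b, tol=1e-9):
    """True when two marginal tables agree entry by entry."""
    return all(abs(a[key] - b[key]) < tol for key in a)


xi, xj, xk = marginal(x, (0,)), marginal(x, (1,)), marginal(x, (2,))
xik, xjk = marginal(x, (0, 2)), marginal(x, (1, 2))

# The three estimates expressed as products of observed marginals.
mutual = {(i, j, k): xi[(i,)] * xj[(j,)] * xk[(k,)] / n ** 2 for i, j, k in cells}
joint = {(i, j, k): xi[(i,)] * xjk[(j, k)] / n for i, j, k in cells}
conditional = {(i, j, k): xik[(i, k)] * xjk[(j, k)] / xk[(k,)] for i, j, k in cells}

all_margins = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
for name, est in [("mutual", mutual), ("i and (jk)", joint),
                  ("i, j given k", conditional)]:
    kept = [keep for keep in all_margins
            if same(marginal(est, keep), marginal(x, keep))]
    print(name, "reproduces the marginals on axes", kept)
```

For these counts the ij-marginal is not reproduced by any of the three estimates, in agreement with the remarks that the corresponding values are not necessarily equal to the observed ones.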


For the three-way table, in addition to the types of independence,
interaction or association just discussed, there arises an additional one,
important historically and practically. This is known as no three-factor
or no second-order interaction. No three-factor or no second-order
interaction implies that the logarithm of the association measured by the
cross-product ratio for any two of the variables is the same for all the
values of the third variable, that is, there is no second-order interaction
if

(2.4)
$\ln \frac{x(111)\,x(221)}{x(121)\,x(211)} = \ln \frac{x(112)\,x(222)}{x(122)\,x(212)}$ ,    i, j
$\ln \frac{x(111)\,x(212)}{x(112)\,x(211)} = \ln \frac{x(121)\,x(222)}{x(122)\,x(221)}$ ,    i, k
$\ln \frac{x(111)\,x(122)}{x(112)\,x(121)} = \ln \frac{x(211)\,x(222)}{x(212)\,x(221)}$ ,    j, k .

One is concerned with the possible hypothesis or model of no
second-order interaction when none of the other types of independence are
found. However, in this case, the corresponding estimate cannot be
expressed explicitly in terms of observed marginals although the estimate
is constrained to have the same two-way marginals as the observed table.
Straightforward iterative procedures exist to determine the estimate
under the hypothesis or model of no second-order interaction. For the
general three-way rxsxt contingency table there are of course many more
relations among the log cross-product ratios like (2.4) which must be
satisfied, but the iterative procedures to determine the estimate extend
to the general case with no difficulty.
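The report does not spell out the iterative procedure at this point. As a minimal sketch, and on the assumption that the procedure meant is the familiar iterative proportional fitting of the three sets of two-way marginals, the Python below starts from a uniform table and rescales it cyclically until the ij-, ik- and jk-marginals agree with those of the observed table. The counts are hypothetical and the function names are our own.

```python
import math
from itertools import product

PAIRS = [(0, 1), (0, 2), (1, 2)]   # ij-, ik- and jk-marginals


def marginal(table, keep):
    """Sum `table` over every index not in `keep` (0 = i, 1 = j, 2 = k)."""
    out = {}
    for cell, value in table.items():
        key = tuple(cell[axis] for axis in keep)
        out[key] = out.get(key, 0.0) + value
    return out


def fit_two_way_marginals(x, tol=1e-8, max_iter=1000):
    """Iterative proportional fitting; assumes all observed two-way marginals are positive."""
    targets = {keep: marginal(x, keep) for keep in PAIRS}
    est = {cell: 1.0 for cell in x}              # start from a uniform table
    for _ in range(max_iter):
        for keep in PAIRS:
            current = marginal(est, keep)
            for cell in est:
                key = tuple(cell[axis] for axis in keep)
                est[cell] *= targets[keep][key] / current[key]
        # Largest remaining discrepancy over all two-way marginals.
        worst = max(abs(marginal(est, keep)[key] - targets[keep][key])
                    for keep in PAIRS for key in targets[keep])
        if worst < tol:
            break
    return est


def log_cross_product_ratio(table, k):
    """Log cross-product ratio of i and j at a fixed level k."""
    return math.log(table[(1, 1, k)] * table[(2, 2, k)]
                    / (table[(1, 2, k)] * table[(2, 1, k)]))


cells = list(product((1, 2), repeat=3))
x = dict(zip(cells, [120, 340, 45, 160, 80, 210, 30, 95]))   # hypothetical counts
fitted = fit_two_way_marginals(x)

lcpr_k1 = log_cross_product_ratio(fitted, 1)
lcpr_k2 = log_cross_product_ratio(fitted, 2)
print(lcpr_k1, lcpr_k2)
```

Because every step multiplies the table by a factor depending on only two of the three indices, the fitted table satisfies the first relation of (2.4); the other two relations follow in the same way.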


We may be concerned with a set of two-way tables for which it is
of interest to determine whether they are homogeneous with respect to a
third factor, say space or time. Such problems may also be treated as
three-way contingency tables using the space or time factor as the third
classification.

For four-way and higher order contingency tables the problem of
presentation of the data increases, as do the variety and number of
questions about relationships of possible interest and varieties of
interaction. The basic ideas, concepts, notation and terminology we have
discussed for the one-, two- and three-way contingency tables extend to
the more general cases as we consider the methodology.

3.  Discrimination  Information 


To make the discussion more specific and with no essential restriction
on the generality, we shall present it in terms of the analysis of
four-way contingency tables. Let us consider the collection of four-way
contingency tables RxSxTxU of dimension rxsxtxu. For convenience let
us denote the aggregate of all cell identifications by $\Omega$ with
individual cells identified by $\omega$ so that the generic variable is
$\omega = (i,j,k,\ell)$, $i = 1,\ldots,r$, $j = 1,\ldots,s$,
$k = 1,\ldots,t$, $\ell = 1,\ldots,u$. Suppose there are two probability
distributions or contingency tables (we shall use these terms
interchangeably) defined over the space $\Omega$, say $p(\omega)$,
$\pi(\omega)$, $\sum_\Omega p(\omega) = 1$, $\sum_\Omega \pi(\omega) = 1$.
The discrimination information is defined by

(3.1)  $I(p:\pi) = \sum_\Omega p(\omega) \ln \frac{p(\omega)}{\pi(\omega)}$ .

The basis for this definition, its properties, and relation to other
definitions of information measures will not be considered in detail in
this exposition. For the particular types of application to which we
shall restrict this exposition the $\pi$-distribution, $\pi(\omega)$, in the
definition (3.1) according to the problem of interest may either be
specified, or it may be an estimated distribution. The p-distribution,
$p(\omega)$, in the definition (3.1) ranges over or is a member of a
family of distributions of interest.

Of the various properties of $I(p:\pi)$ we mention in particular the
fact that $I(p:\pi) \ge 0$ and $= 0$ if and only if $p(\omega) = \pi(\omega)$.
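Definition (3.1) translates directly into a few lines of code. The Python sketch below is illustrative; the function name `discrimination_information` and the two small distributions are our own, and we adopt the usual convention, assumed here rather than stated in the text, that cells with p = 0 contribute nothing to the sum.

```python
import math


def discrimination_information(p, pi):
    """I(p:pi) = sum of p ln(p/pi) over cells; cells with p = 0 add zero (assumed convention)."""
    return sum(a * math.log(a / b) for a, b in zip(p, pi) if a > 0)


uniform = [0.25, 0.25, 0.25, 0.25]   # hypothetical pi-distribution
p = [0.4, 0.3, 0.2, 0.1]             # hypothetical p-distribution

info_p_vs_uniform = discrimination_information(p, uniform)
info_p_vs_itself = discrimination_information(p, p)

print(info_p_vs_uniform)   # positive, since p differs from the uniform distribution
print(info_p_vs_itself)    # zero, since the two distributions coincide
```

The two printed values illustrate the property just mentioned: the measure is positive when the distributions differ and vanishes when they coincide.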




4.  Minimum Discrimination Information Estimation

Many problems in the analysis of contingency tables may be characterized
as estimating a distribution or contingency table subject to
certain  restraints  and  then  comparing  the  estimated  table  with  an  observed 
table  to  determine  whether  the  observed  table  satisfies  a  null  hypothesis 
or  model  implied  by  the  restraints.  In  accordance  with  the  principle  of 
minimum discrimination information estimation we determine that member of
the collection or family of p-distributions satisfying the restraints
which minimizes the discrimination information $I(p:\pi)$ over all members
of the family of pertinent p-distributions. We denote the minimum
discrimination information estimate by $p^*(\omega)$ so that

(4.1)  $I(p^*:\pi) = \sum p^*(\omega) \ln \frac{p^*(\omega)}{\pi(\omega)} = \min I(p:\pi)$ .

Unless otherwise stated, the summation is over $\Omega$, which will be
omitted.

In  a  wide  class  of  problems  which  can  be  characterized  as  "smoothing" 
or  fitting  an  observed  contingency  table  the  restraints  specify  that  the 
estimated  distribution  or  contingency  table  have  some  set  of  marginals 
which  are  the  same  as  those  of  an  observed  contingency  table.  In  such 
cases $\pi(\omega)$ is taken to be either the uniform distribution
$\pi(ijk\ell) = 1/rstu$ or a distribution already estimated subject to
restraints contained in and implied by the restraints under examination.
The latter case
includes  the  classical  hypotheses  of  independence,  conditional  independence, 
homogeneity,  conditional  homogeneity  and  interaction,  all  of  which  can  be 
considered  as  instances  of  generalized  independence  and  will  be  considered 
in  some  detail  in  this  report.  By  generalized  independence  is  meant  the 
fact  that  the  estimates  may  be  expressed  as  a  product  of  factors  which  are 
functions  of  appropriate  marginals. 

5.  Minimum Discrimination Information Statistic

To test whether an observed contingency table is consistent with
the null hypothesis or model as represented by the minimum discrimination
information estimate we compute a measure of the deviation between the
observed distribution and the appropriate estimate by the minimum
discrimination information statistic. For notational and later
computational convenience let us denote the estimated contingency table
in terms of occurrences by $x^*(\omega) = n p^*(\omega)$. For the
"smoothing" or fitting class of problems, that is, with the restraints
implied by a set of observed marginals (those of a generalized
independence hypothesis), the minimum discrimination information (m.d.i.)
statistic is

(5.1)  $2I(x:x^*) = 2 \sum x(\omega) \ln \frac{x(\omega)}{x^*(\omega)}$ ,

which is asymptotically distributed as a $\chi^2$ with appropriate degrees
of freedom under the null hypothesis.

The statistic in (5.1) is also minus twice the logarithm of the
classic  likelihood  ratio  statistic  but  this  is  not  necessarily  true  for 
other  kinds  of  applications  of  the  general  theory. 
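As an illustration of (5.1), the sketch below evaluates the m.d.i. statistic for the counts of Table 2.2a against the estimate under independence. It is written in Python with function names of our own choosing. The single degree of freedom used for the comparison is the standard (r-1)(c-1) count for independence in a two-way table, which is our assumption here since the text has not yet discussed degrees of freedom.

```python
import math


def independence_estimate(table):
    """x*(ij) = x(i.) x(.j) / n for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def mdi_statistic(observed, estimate):
    """2 I(x:x*) = 2 * sum of x ln(x / x*) over cells; empty cells add zero."""
    total = 0.0
    for obs_row, est_row in zip(observed, estimate):
        for x, x_star in zip(obs_row, est_row):
            if x > 0:
                total += x * math.log(x / x_star)
    return 2.0 * total


observed = [[511, 12637], [73, 1629]]   # Table 2.2a
estimate = independence_estimate(observed)
statistic = mdi_statistic(observed, estimate)

# Assumed degrees of freedom for independence in an r x c table.
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

print(statistic, degrees_of_freedom)
```

The resulting value would be compared with a tabulated percentage point of the chi-square distribution, for example 3.84 at the 5 percent level with one degree of freedom.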

6.  Minimum Discrimination Information Theorem

We now present a theorem which is the basis for the principle of
minimum discrimination information estimation and its applications. We
shall present it in a form related to the context of this discussion on
the analysis of contingency tables.

Let us consider the space $\Omega$ mentioned in Section 3 and the
discrimination information introduced in (3.1). Suppose now, for example,
that there are three linearly independent statistics of interest defined
over the space $\Omega$,

(6.1)  $T_1(\omega)$ ,  $T_2(\omega)$ ,  $T_3(\omega)$ .

Let us determine the value of $p(\omega)$ which minimizes the
discrimination information

(6.2)  $I(p:\pi) = \sum p(\omega) \ln \frac{p(\omega)}{\pi(\omega)}$

over the family of p-distributions which satisfies the restraints

(6.3)  $\sum T_1(\omega)\,p(\omega) = \theta_1^*$
       $\sum T_2(\omega)\,p(\omega) = \theta_2^*$
       $\sum T_3(\omega)\,p(\omega) = \theta_3^*$

where $\theta_1^*$, $\theta_2^*$, $\theta_3^*$ are specified values, and
$\pi(\omega)$ is a fixed distribution.

If $\pi(\omega)$ satisfies the restraints (6.3), then of course the
minimum value of $I(p:\pi)$ is zero and the minimizing distribution is
$p^*(\omega) = \pi(\omega)$. More generally, the minimum discrimination
information theorem states that the minimizing distribution is given by

(6.4)  $p^*(\omega) = \frac{\exp(\tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega))\,\pi(\omega)}{M(\tau_1,\tau_2,\tau_3)}$

where

(6.5)  $M(\tau_1,\tau_2,\tau_3) = \sum \exp(\tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega))\,\pi(\omega)$

is a normalizing factor so that $\sum p^*(\omega) = 1$, and the $\tau$'s
are parameters which technically are in essence undetermined Lagrange
multipliers whose values are defined in terms of $\theta_1^*$,
$\theta_2^*$, $\theta_3^*$ by

(6.6)
$\theta_1^* = \frac{\partial}{\partial \tau_1} \ln M(\tau_1,\tau_2,\tau_3)$
$\qquad = \left(\sum \exp(\tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega))\,T_1(\omega)\,\pi(\omega)\right) / M(\tau_1,\tau_2,\tau_3)$
$\qquad = \sum T_1(\omega)\,p^*(\omega)$

$\theta_2^* = \frac{\partial}{\partial \tau_2} \ln M(\tau_1,\tau_2,\tau_3)$
$\qquad = \left(\sum \exp(\tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega))\,T_2(\omega)\,\pi(\omega)\right) / M(\tau_1,\tau_2,\tau_3)$
$\qquad = \sum T_2(\omega)\,p^*(\omega)$

$\theta_3^* = \frac{\partial}{\partial \tau_3} \ln M(\tau_1,\tau_2,\tau_3)$
$\qquad = \left(\sum \exp(\tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega))\,T_3(\omega)\,\pi(\omega)\right) / M(\tau_1,\tau_2,\tau_3)$
$\qquad = \sum T_3(\omega)\,p^*(\omega)$ .

We  can  now  state  a  number  of  consequences  of  the  preceding. 




We note first that $p^*(\omega)$ is a member of an exponential family of
distributions generated by $\pi(\omega)$, and as such has the desirable
statistical properties of members of an exponential family, which include
all the common and classic distributions. We may also write (6.4) as

(6.7)  $\ln \frac{p^*(\omega)}{\pi(\omega)} = -\ln M(\tau_1,\tau_2,\tau_3) + \tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega)$
       $\qquad = L + \tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega)$

with $L = -\ln M(\tau_1,\tau_2,\tau_3)$. The regression or log-linear
expression in (6.7) for $\ln(p^*(\omega)/\pi(\omega))$ with $T_1(\omega)$,
$T_2(\omega)$, $T_3(\omega)$ as the explanatory variables and $\tau_1$,
$\tau_2$, $\tau_3$ as the regression coefficients plays an important role
in the analysis we shall consider.

We note next that the minimum value of the discrimination information
(6.2) is

(6.8)  $I(p^*:\pi) = \tau_1 \theta_1^* + \tau_2 \theta_2^* + \tau_3 \theta_3^* - \ln M(\tau_1,\tau_2,\tau_3)$

where the $\theta^*$'s are defined in (6.3) and the $\tau$'s are
determined to satisfy (6.6). Using the value in (6.7) it may be shown that
if $p(\omega)$ is any member of the family of distributions satisfying
(6.3), then

(6.9)  $I(p:\pi) = I(p^*:\pi) + I(p:p^*)$ .

The pythagorean type property (6.9) plays an important role in the analysis
of information tables.
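A small numerical illustration may make (6.4), (6.8) and (6.9) concrete. The example below is hypothetical and is not taken from the report: a six-cell space with uniform $\pi$, a single statistic T equal to the cell label, and the restraint that its mean be 4.5. The Python sketch solves for the single parameter by bisection, which is our own choice of method, valid because the mean under (6.4) increases with the parameter.

```python
import math


def tilt(pi, T, tau):
    """Return p*(w) of (6.4) and M(tau) of (6.5) for a single statistic T."""
    weights = [math.exp(tau * t) * q for t, q in zip(T, pi)]
    M = sum(weights)
    return [w / M for w in weights], M


def solve_tau(pi, T, theta, lo=-20.0, hi=20.0, iterations=200):
    """Bisection on tau so that the mean of T under p* equals theta."""
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        p, _ = tilt(pi, T, mid)
        mean = sum(t * q for t, q in zip(T, p))
        if mean < theta:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def info(p, q):
    """Discrimination information I(p:q); cells with p = 0 add zero."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


pi = [1.0 / 6.0] * 6            # hypothetical fixed distribution
T = [1, 2, 3, 4, 5, 6]          # hypothetical statistic T(w)
theta = 4.5                     # hypothetical specified value theta*

tau = solve_tau(pi, T, theta)
p_star, M = tilt(pi, T, tau)

# (6.8): the minimum value equals tau * theta* - ln M.
minimum_info = info(p_star, pi)
closed_form = tau * theta - math.log(M)

# (6.9): any other p with the same mean satisfies the pythagorean relation.
p_other = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]     # mean is also 4.5
lhs = info(p_other, pi)
rhs = minimum_info + info(p_other, p_star)

print(tau, minimum_info, closed_form)
print(lhs, rhs)
```

With several statistics the same relations hold, but the parameters must then be found jointly, for example by an iterative scheme on the equations (6.6).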

7.  Computational Procedures

An  "experiment"  has  been  designed  and  observations  made  resulting 
in  a  multi-dimensional  contingency  table  with  the  desired  classifications 
and  categories.  All  the  information  the  analyst  hopes  to  obtain  from  the 
"experiment"  is  contained  in  the  contingency  table.  In  the  process  of 
analysis,  the  aim  is  to  fit  the  observed  table  by  a  minimal  or  parsimonious 
number of parameters depending on some or all of the marginals, that is,
to find out how much of this total information is contained in a summary
consisting of sets of marginals. Indeed, the relationship between the
concept of independence or association and interaction in contingency
tables and the role the marginals play is evidenced in the historical
developments in the extensive literature on the analysis of contingency
tables. Thus, the $\theta^*$'s in the preceding discussion will be the
marginals of interest.


7.1.  The $T(\omega)$ Functions.  The $T(\omega)$ functions for the RxSxTxU
table turn out to be a basic set of simple functions and their various
products. Thus, for example, the $T(\omega)$ function associated with the
one-way marginal p(2...) is

(7.1)  $T^R_2(ijk\ell) = 1$ for $i = 2$, any $j, k, \ell$
       $\qquad\quad = 0$ otherwise

since

(7.2)  $\sum p(ijk\ell)\,T^R_2(ijk\ell) = p(2...)$ .

Similarly the $T(\omega)$ function associated with the one-way marginal
p(..3.), for example, is

(7.3)  $T^T_3(ijk\ell) = 1$ for $k = 3$, any $i, j, \ell$
       $\qquad\quad = 0$ otherwise

since

(7.4)  $\sum p(ijk\ell)\,T^T_3(ijk\ell) = p(..3.)$ .

Thus for the rxsxtxu table there are

(7.5)
(r-1) linearly independent functions $T^R_\alpha(ijk\ell)$, $\alpha = 1,\ldots,r-1$
(s-1) linearly independent functions $T^S_\beta(ijk\ell)$, $\beta = 1,\ldots,s-1$
(t-1) linearly independent functions $T^T_\gamma(ijk\ell)$, $\gamma = 1,\ldots,t-1$
(u-1) linearly independent functions $T^U_\delta(ijk\ell)$, $\delta = 1,\ldots,u-1$ ,

since, for example,

$\sum_{\alpha=1}^{r} \sum_{\Omega} T^R_\alpha(ijk\ell) = rstu$ .

We have arbitrarily excluded the functions corresponding to $\alpha = r$,
$\beta = s$, $\gamma = t$, $\delta = u$ as a matter of convenience. We could
have selected $\alpha = 1$, $\beta = 1$, $\gamma = 1$, $\delta = 1$ or any
other set of values.


The $T(\omega)$ function associated with the two-way marginal $p(12..)$,
say, is $T^R_1(ijk\ell)\, T^S_2(ijk\ell)$ since from the definition of $T^R_1(ijk\ell)$ and
$T^S_2(ijk\ell)$ it may be seen that

(7.6)  $T^R_1(ijk\ell)\, T^S_2(ijk\ell) = 1$ for $i = 1$, $j = 2$, any $k, \ell$
       $\phantom{T^R_1(ijk\ell)\, T^S_2(ijk\ell)} = 0$ otherwise

and

(7.7)  $\sum_{ijk\ell} p(ijk\ell)\, T^R_1(ijk\ell)\, T^S_2(ijk\ell) = p(12..)$ .

For convenience we shall write $T^R_\alpha(ijk\ell)\, T^S_\beta(ijk\ell) = T^{RS}_{\alpha\beta}(ijk\ell)$, etc. Thus
the $T(\omega)$ function associated with any two-way marginal is a product of
two appropriate functions of the set (7.5).


Similarly the $T(\omega)$ function associated with any three-way marginal
will be a product of three of the appropriate functions of the set (7.5),
for example,

(7.8)  $\sum_{ijk\ell} p(ijk\ell)\, T^R_2(ijk\ell)\, T^T_1(ijk\ell)\, T^U_3(ijk\ell) = p(2.13)$ .

For convenience we shall write $T^R_\alpha(ijk\ell)\, T^S_\beta(ijk\ell)\, T^T_\gamma(ijk\ell) = T^{RST}_{\alpha\beta\gamma}(ijk\ell)$,
etc.


Similarly the $T(\omega)$ function associated with any four-way marginal
will be a product of four of the appropriate functions of the set (7.5),
for example,

(7.9)  $\sum_{ijk\ell} p(ijk\ell)\, T^R_2(ijk\ell)\, T^S_1(ijk\ell)\, T^T_1(ijk\ell)\, T^U_2(ijk\ell) = p(2112)$ .

For convenience we shall write

  $T^R_\alpha(ijk\ell)\, T^S_\beta(ijk\ell)\, T^T_\gamma(ijk\ell)\, T^U_\delta(ijk\ell) = T^{RSTU}_{\alpha\beta\gamma\delta}(ijk\ell)$ .

We note that there are a total of

(7.10)
  $N_1 = (r-1) + (s-1) + (t-1) + (u-1)$
  $N_2 = (r-1)(s-1) + (r-1)(t-1) + (r-1)(u-1) + (s-1)(t-1) + (s-1)(u-1) + (t-1)(u-1)$
  $N_3 = (r-1)(s-1)(t-1) + (r-1)(s-1)(u-1) + (r-1)(t-1)(u-1) + (s-1)(t-1)(u-1)$
  $N_4 = (r-1)(s-1)(t-1)(u-1)$ ,

respectively, of the simple linearly independent functions and their
products two, three, four at a time. It may be verified that

(7.11)  $rstu - 1 = N = N_1 + N_2 + N_3 + N_4$ .

These values of the number of $T(\omega)$ functions (or associated tau
parameters) appear as appropriate degrees of freedom in the analysis of
information tables.
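
The counts in (7.10) and the identity (7.11) can be checked numerically. The following is a minimal sketch, not part of the original computer program; the function name `count_t_functions` is an assumption introduced here for illustration.

```python
from itertools import combinations
from math import prod

def count_t_functions(sizes):
    """Return [N_1, ..., N_m] for a table with the given category counts.

    N_m is the sum, over all m-subsets of the variables, of the product of
    (number of categories - 1), as in (7.10).
    """
    reduced = [c - 1 for c in sizes]
    return [sum(prod(combo) for combo in combinations(reduced, m))
            for m in range(1, len(sizes) + 1)]

# A 4x3x2x2 table (the dimensions used later in Section 12).
sizes = (4, 3, 2, 2)
counts = count_t_functions(sizes)
print(counts, sum(counts), prod(sizes) - 1)  # the last two numbers agree, as in (7.11)
```

For the 2x2x2x2 table the same function gives the column counts 4, 6, 4, 1 used in Section 8.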


7.2. The Estimated $p^*(\omega)$ Values. In the usual least squares
regression analysis procedure, one first computes the regression
coefficients and then gets the values of the estimates. In the methodology we
use we reverse the procedure. Instead of trying to obtain the values of
the $\tau$'s from (6.6) (which is possible) we shall first obtain the values
of the estimates $p^*(\omega)$ by a straightforward convergent iterative
procedure and then derive the values of the $\tau$'s from (6.7). We shall
not discuss the details of the iteration here, as they are in the computer
program and have been described elsewhere. The iteration may be described
as successively cycling through adjustments of the marginals of interest,
starting with the $\pi(\omega)$ distribution, until a desired accuracy of
agreement between the set of observed marginals of interest and the computed
marginals has been attained.
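
The cycling described above can be illustrated on a two-way table. The sketch below is an assumption-laden illustration (iterative proportional adjustment of row and column totals starting from the uniform table), not the report's computer program; the function name, tolerance, and the example counts are hypothetical, and all marginal totals are assumed positive.

```python
def fit_one_way_marginals(table, tol=1e-10, max_cycles=100):
    """Cycle through row and column adjustments, starting from the uniform table."""
    rows, cols = len(table), len(table[0])
    n = sum(map(sum, table))
    row_tot = [sum(r) for r in table]
    col_tot = [sum(table[i][j] for i in range(rows)) for j in range(cols)]
    est = [[n / (rows * cols)] * cols for _ in range(rows)]  # n * pi(ij)
    for _ in range(max_cycles):
        # Adjust to the observed row marginals.
        for i in range(rows):
            s = sum(est[i])
            est[i] = [v * row_tot[i] / s for v in est[i]]
        # Adjust to the observed column marginals.
        for j in range(cols):
            s = sum(est[i][j] for i in range(rows))
            for i in range(rows):
                est[i][j] *= col_tot[j] / s
        # Stop when the row marginals (adjusted first) still agree.
        gap = max(abs(sum(est[i]) - row_tot[i]) for i in range(rows))
        if gap < tol:
            break
    return est

observed = [[10, 20], [30, 40]]  # hypothetical counts
fitted = fit_one_way_marginals(observed)
print(fitted)
```

For one-way marginals the fitted cells coincide with the product form x(i.)x(.j)/n, which gives a simple check on the iteration.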


7.3. The $\tau$ Values or Interaction Parameters. From the definitions
of the $T(\omega)$ functions in Section 7.1 it is clear that they take on only
the values 0 or 1 for each value of $\omega$. From the nature of the $T(\omega)$
functions the set of regression or log-linear Equations (6.7) will have
some with a single $\tau$ value which can be determined. Then there will be
a set with one additional unknown value and some of the $\tau$'s already
determined. These new unknown $\tau$ values can then be determined. This
process of successive evaluation is carried on until all the values of
$\tau$ are determined. They are also available as output of a general
computer program.


8.  Graphic  Representation 


A useful graphic representation of the log-linear regression (6.7)
is given in Figure 8.1 for a 2x2x2x2 contingency table. This is the
analogue of the design matrix in normal regression theory. The blank
spaces in Figure 8.1 represent zero values. The $(ijk\ell)$-columns are the
cell identifications in the same lexicographic order as the cell entries
for the estimates in the computer output. Column 1 corresponds to $L$,
which is essentially a normalizing factor. Each of the columns 2 to 16
represents the corresponding values of the $T(\omega)$ functions: columns 2
to 5 those for the one-way marginals, columns 6 to 11 those for the
two-way marginals, columns 12 to 15 those for the three-way marginals, and
column 16 that for the four-way marginal. For convenience the columns
are also arranged in lexicographic order. The tau parameter associated
with the $T(\omega)$ function is given at the head of the column. The full
representation with all the columns of Figure 8.1 generates the observed
values. Thus the rows represent


(8.1)  $\ln \dfrac{p(ijk\ell)}{\pi(ijk\ell)} = \ln \dfrac{x(ijk\ell)}{n\pi(ijk\ell)} = L + \tau^{i}_{1} T^{i}_{1}(ijk\ell) + \cdots + \tau^{\ell}_{1} T^{\ell}_{1}(ijk\ell) + \cdots + \tau^{ijk}_{111} T^{ijk}_{111}(ijk\ell) + \cdots + \tau^{ijk\ell}_{1111} T^{ijk\ell}_{1111}(ijk\ell)$

where $\pi(ijk\ell)$ in the 2x2x2x2 case is $1/(2 \times 2 \times 2 \times 2)$ and the numerical
values of $L$ and the taus depend on the observed values $x(ijk\ell)$. The
design matrix corresponding to an estimate uses only those columns
associated with the marginals explicit and implied in the fitting process.

This is a reflection of the fact that higher order marginals imply certain
lower order marginals, for example, the two-way marginal $x(ij..)$ implies,
by summation over $i$ and $j$, the one-way marginals $x(.j..)$, $x(i...)$,
and the total $n = x(....)$. Thus the estimate based on fitting the one-way
marginals will use only columns 1-5. The values of $L$ and the taus for this
estimate will be different from those for $x(ijk\ell)$ and depend on the
estimate $x^*_1(ijk\ell)$. Thus if we denote the estimate based on fitting the one-way
marginals as $x^*_1(ijk\ell)$, the representation in Figure 8.1 implies


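(8.2)
  $\ln \dfrac{x^*_1(1111)}{n\pi} = L + \tau^{i}_{1} + \tau^{j}_{1} + \tau^{k}_{1} + \tau^{\ell}_{1}$

  $\ln \dfrac{x^*_1(1112)}{n\pi} = L + \tau^{i}_{1} + \tau^{j}_{1} + \tau^{k}_{1}$

  $\qquad \vdots$

  $\ln \dfrac{x^*_1(2222)}{n\pi} = L$ .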


The estimate based on fitting the two-way marginals will use columns 1-11
since the two-way marginals also imply the one-way marginals. The values
of $L$ and the taus for this estimate will be different from those for the
observed values or other estimates and depend on the values of the estimate
which we denote by $x^*_2(ijk\ell)$. For the estimate fitting the two-way
marginals the representation in Figure 8.1 implies


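(8.3)
  $\ln \dfrac{x^*_2(1111)}{n\pi} = L + \tau^{i}_{1} + \tau^{j}_{1} + \tau^{k}_{1} + \tau^{\ell}_{1} + \tau^{ij}_{11} + \tau^{ik}_{11} + \tau^{i\ell}_{11} + \tau^{jk}_{11} + \tau^{j\ell}_{11} + \tau^{k\ell}_{11}$

  $\ln \dfrac{x^*_2(1112)}{n\pi} = L + \tau^{i}_{1} + \tau^{j}_{1} + \tau^{k}_{1} + \tau^{ij}_{11} + \tau^{ik}_{11} + \tau^{jk}_{11}$

  $\qquad \vdots$

  $\ln \dfrac{x^*_2(2222)}{n\pi} = L$ .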

The representation for the uniform distribution corresponds to column 1 only.
Note that in accordance with (7.10) and (7.11)

  $N_1 = 1 + 1 + 1 + 1 = 4$          (columns 2 to 5)
  $N_2 = 1 + 1 + 1 + 1 + 1 + 1 = 6$  (columns 6 to 11)
  $N_3 = 1 + 1 + 1 + 1 = 4$          (columns 12 to 15)
  $N_4 = 1$                          (column 16)

  $N = 16 - 1 = 15 = 4 + 6 + 4 + 1$ .
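
The layout of Figure 8.1 can be regenerated from the definitions in Section 7.1. The sketch below builds a 16x16 array for the 2x2x2x2 table under the convention, stated in the text, that functions with index 2 are excluded; the function name `design_matrix` and the exact column ordering within each group are assumptions made for illustration.

```python
from itertools import combinations, product

def design_matrix(n_vars=4):
    """Rows: cells (ijkl) in lexicographic order. Columns: L, then products of
    one, two, three and four of the simple T functions (level 1 retained)."""
    cells = list(product((1, 2), repeat=n_vars))
    subsets = [s for m in range(n_vars + 1)
               for s in combinations(range(n_vars), m)]
    # A product T function equals 1 only when every variable in the subset
    # is at level 1; the empty subset gives the all-ones column for L.
    matrix = [[int(all(cell[v] == 1 for v in s)) for s in subsets]
              for cell in cells]
    return matrix, subsets

matrix, subsets = design_matrix()
group_sizes = [sum(1 for s in subsets if len(s) == m) for m in range(5)]
print(group_sizes)   # numbers of columns for L and the one- to four-way terms
print(matrix[1])     # row for cell 1112
```

The first row (cell 1111) contains a 1 in every column and the last row (cell 2222) only in the column for L, matching the first and last lines of the displayed representations.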


9. Analysis of Information

Although the preceding discussion has at times been in terms of
probabilities, estimated probabilities or relative frequencies, in practice
it has been found more convenient not to divide everything by $n$, the total
number of occurrences, and to deal with observed or estimated occurrences, that
is, with $n\pi(ijk\ell) = n/rstu$, $x(ijk\ell)$, $x(i...)$, $x(.jk.)$, $x^*(ijk\ell) = np^*(ijk\ell)$,
etc. The analysis of information is based on the fundamental
relation (6.9) for the minimum discrimination information statistics.
Specifically, if $np^*_a(\omega) = x^*_a(\omega)$ is the minimum discrimination information
estimate corresponding to a set $H_a$ of given marginals and $x^*_b(\omega)$ is the
minimum discrimination information estimate corresponding to a set $H_b$
of given marginals, where $H_a$ is explicitly or implicitly contained in
$H_b$, then the basic relations are

(9.1)
  $2I(x : n\pi) = 2I(x^*_a : n\pi) + 2I(x : x^*_a)$
  $2I(x : n\pi) = 2I(x^*_b : n\pi) + 2I(x : x^*_b)$
  $2I(x^*_b : n\pi) = 2I(x^*_a : n\pi) + 2I(x^*_b : x^*_a)$
  $2I(x : x^*_a) = 2I(x^*_b : x^*_a) + 2I(x : x^*_b)$

with a corresponding additive relation for the associated degrees of
freedom.
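
The additive relations can be verified numerically on a small example. The sketch below uses a hypothetical 2x2x2 table and two nested sets of marginals whose estimates have closed forms: the one-way marginals, and the pair x(ij.), x(..k). The counts, the helper names `two_i` and `marginal`, and the choice of nested sets are assumptions made for illustration.

```python
from itertools import product
from math import log

def two_i(a, b):
    """Return 2I(a:b) = 2 * sum over cells of a * ln(a / b)."""
    return 2.0 * sum(a[w] * log(a[w] / b[w]) for w in a if a[w] > 0)

def marginal(table, axes):
    """Sum a table (dict keyed by cell tuples) down to the given axes."""
    out = {}
    for cell, value in table.items():
        key = tuple(cell[d] for d in axes)
        out[key] = out.get(key, 0.0) + value
    return out

cells = list(product((0, 1), repeat=3))
x = dict(zip(cells, [12, 7, 9, 14, 5, 11, 8, 6]))  # hypothetical counts
n = sum(x.values())
x_i, x_j, x_k = (marginal(x, (d,)) for d in range(3))
x_ij = marginal(x, (0, 1))

uniform = {w: n / len(cells) for w in cells}                           # n * pi
x_a = {w: x_i[w[:1]] * x_j[w[1:2]] * x_k[w[2:]] / n**2 for w in cells}  # fits one-way marginals
x_b = {w: x_ij[w[:2]] * x_k[w[2:]] / n for w in cells}                  # fits x(ij.), x(..k)

print(two_i(x, x_a), two_i(x_b, x_a) + two_i(x, x_b))
print(two_i(x, uniform), two_i(x_a, uniform) + two_i(x, x_a))
```

In each printed pair the two numbers agree up to rounding, which is the additivity the analysis of information tables rely on.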


In terms of the representation in (6.4) or (6.7) or Figure 8.1 as
an exponential family, for our discussion, the two extreme cases are the
uniform distribution, for which all $\tau$'s are zero, and the observed
contingency table or distribution, for which all $N = rstu - 1$ $\tau$'s are
needed.


Measures of the form $2I(x : x^*_a)$, that is, the comparison of an
observed contingency table with an estimated contingency table, are called
measures of interaction or goodness-of-fit. Measures of the form
$2I(x^*_b : x^*_a)$, comparing two estimated contingency tables, are called
measures of effect, that is, the effect of the marginals in the set $H_b$ but
not in the set $H_a$, or the taus in $x^*_b$ but not in $x^*_a$. We note that
$2I(x : x^*_a)$ tests a null hypothesis that the values of the $\tau$ parameters in
the representation of the observed contingency table $x(\omega)$ but not in the
representation of the estimated table $x^*_a(\omega)$ are zero, and the number of
these taus is the number of degrees of freedom. Similarly $2I(x^*_b : x^*_a)$
tests a null hypothesis that the values of the $\tau$ parameters in the
representation of the estimated table $x^*_b(\omega)$ but not in the representation of
the estimated table $x^*_a(\omega)$ are zero, and the number of these
taus is the number of degrees of freedom.

We summarize the additive relationships of the m.d.i. statistics
and the associated degrees of freedom in the Analysis of Information
Table 9.1.

TABLE 9.1

ANALYSIS OF INFORMATION TABLE

Component due to        Information              D.F.

$H_a$:  Interaction     $2I(x : x^*_a)$          $N_a$

$H_b$:  Effect          $2I(x^*_b : x^*_a)$      $N_a - N_b$
        Interaction     $2I(x : x^*_b)$          $N_b$

Since measures of the form $2I(x : x^*_a)$ may also be interpreted as measures
of the "variation unexplained" by the estimate $x^*_a$, the additive
relationship leads to the interpretation of the ratio

(9.2)  $\dfrac{2I(x : x^*_a) - 2I(x : x^*_b)}{2I(x : x^*_a)} = \dfrac{2I(x^*_b : x^*_a)}{2I(x : x^*_a)}$

as the percentage of the unexplained variation due to $x^*_a$ accounted for
by the additional constraints defining $x^*_b$. The ratio (9.2) is thus
similar to the squared correlation coefficients associated with normal
distributions.


We remark that the marginals, explicit and implicit, of the estimated
table $x^*_a(\omega)$ which form the set of restraints $H_a$ used to generate $x^*_a(\omega)$
are the same as the corresponding marginals of the observed $x(\omega)$ table
and all lower order implied marginals. It may be shown that $2I(x : x^*_a)$ is
approximately a quadratic in the differences between the remaining
marginals of the $x(\omega)$ table and the corresponding ones as calculated from
the $x^*_a(\omega)$ table.

Similarly $2I(x^*_b : x^*_a)$ is also approximately a quadratic in the
differences between those additional marginal restraints in $H_b$ but not
in $H_a$ and the corresponding marginal values as computed from the $x^*_a(\omega)$
table.


As may be seen, because of the nature of the $T(\omega)$ functions
described in Section 7.1 or indicated in Figure 8.1, the $\tau$'s are
determined from the log-linear regression Equations (6.7) (see (8.2) and (11.3))
as sums and differences of values of $\ln x^*(ijk\ell)$. A variety of
statistics have been presented in the literature for the analysis of contingency
tables which are quadratics in differences of marginal values or quadratics
in the $\tau$'s or the linear combinations of logarithms of the observed or
estimated values. The principle of minimum discrimination information
estimation and its procedures thus provides a unifying relationship, since
such statistics may be seen as quadratic approximations of the minimum
discrimination information statistic. We remark that the corresponding
approximate $\chi^2$'s are not generally additive.

We mention the approximations in terms of quadratic forms in the
marginals or the $\tau$'s as a possible bridge connecting the familiar
procedures of classical regression analysis and the procedures proposed here,
to assist in understanding and interpreting the analysis of information
tables. The covariance matrix of the $T(\omega)$ functions or the taus can
be obtained for either the observed table or any of the estimated tables,
as well as the inverse matrices, as part of the output of the general
computer program.

10. Outliers


We  define  outliers  as  observations  in  one  or  more  cells  of  a  con¬ 
tingency  table  which  apparently  deviate  significantly  from  a  fitted  model. 
These  outliers  may  lead  one  to  reject  a  model  which  fits  the  other 
observations.  For  example,  in  multi-dimensional  contingency  tables  in 
which  time  or  age  is  one  of  the  classifications  there  may  occur  an  age 
effect  such  that  a  model  may  be  rejected  for  the  entire  table  but  a  model 
taking  the  possible  age  effect  into  account  may  lead  to  an  acceptable 
partitioning  of  the  model. 

In other cases even though a model seems to fit, the outliers
contribute much more than reasonable to the measure of deviation between the
data and the fitted values of the model. In other words, the outliers
make up a large percentage of the "unexplained variation" $2I(x : x^*_a)$.

A clue to possible outliers is provided by the output of the
computer program. In the computer output for each estimate five entries are
listed for each cell. The fourth of these is titled OUTLIER and its
numerical value provides a lower bound for the decrease in the
corresponding $2I(x : x^*_a)$ if that cell were not included in the fitting
procedure. Since the reduction in the degrees of freedom is one for each
omitted cell, values of OUTLIER greater than say 3.5 are of interest. The
basis for the OUTLIER computation and interpretation follows. Let $x^*_a$
denote the minimum discrimination information estimate subject to certain
marginal restraints. Let $x^*_b$ denote the minimum discrimination
information estimate subject to the same marginal restraints as $x^*_a$ except
that the value $x(\omega_1)$, say, is not included, so that $x^*_b(\omega_1) = x(\omega_1)$.
The basic additivity property of the minimum discrimination information
statistics states that

  $2I(x : x^*_a) = 2I(x^*_b : x^*_a) + 2I(x : x^*_b)$

or

  $2I(x : x^*_a) - 2I(x : x^*_b) = 2I(x^*_b : x^*_a)$ .

These  results  are  summarized  in  the  Analysis  of  Information  Table  10.1. 


TABLE 10.1

ANALYSIS OF INFORMATION TABLE

Component due to                                   Information              D.F.

$H_a$                                              $2I(x : x^*_a)$          $N_a$

$H_b$: Same as $H_a$ but omitting $x(\omega_1)$    $2I(x^*_b : x^*_a)$      $1$
                                                   $2I(x : x^*_b)$          $N_a - 1$


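But

(10.1)  $2I(x^*_b : x^*_a) = 2\left[ x^*_b(\omega_1) \ln \dfrac{x^*_b(\omega_1)}{x^*_a(\omega_1)} + \sum_{\Omega - \omega_1} x^*_b(\omega) \ln \dfrac{x^*_b(\omega)}{x^*_a(\omega)} \right]$

        $\qquad = 2\left[ x(\omega_1) \ln \dfrac{x(\omega_1)}{x^*_a(\omega_1)} + \sum_{\Omega - \omega_1} x^*_b(\omega) \ln \dfrac{x^*_b(\omega)}{x^*_a(\omega)} \right]$

and using the convexity property which implies that

(10.2)  $\sum_{\Omega - \omega_1} x^*_b(\omega) \ln \dfrac{x^*_b(\omega)}{x^*_a(\omega)} \ge \left( \sum_{\Omega - \omega_1} x^*_b(\omega) \right) \ln \dfrac{\sum_{\Omega - \omega_1} x^*_b(\omega)}{\sum_{\Omega - \omega_1} x^*_a(\omega)} = (n - x(\omega_1)) \ln \dfrac{n - x(\omega_1)}{n - x^*_a(\omega_1)}$

we get from (10.1) that

(10.3)  $2I(x^*_b : x^*_a) \ge 2\left[ x(\omega_1) \ln \dfrac{x(\omega_1)}{x^*_a(\omega_1)} + \left( \sum_{\Omega - \omega_1} x^*_b(\omega) \right) \ln \dfrac{\sum_{\Omega - \omega_1} x^*_b(\omega)}{\sum_{\Omega - \omega_1} x^*_a(\omega)} \right]$

        $\qquad = 2\left[ x(\omega_1) \ln \dfrac{x(\omega_1)}{x^*_a(\omega_1)} + (n - x(\omega_1)) \ln \dfrac{n - x(\omega_1)}{n - x^*_a(\omega_1)} \right]$ .

The last value can be computed and is listed as the OUTLIER entry for each
cell of the computer output for the estimate $x^*_a$.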
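
The final expression in (10.3) depends only on the observed cell value, the fitted cell value and n, so it is easy to compute per cell. The following is a minimal sketch of that lower bound; the function name `outlier_bound` and the example numbers are hypothetical, and the report's own program may organize the computation differently.

```python
from math import log

def outlier_bound(x_cell, fitted_cell, n):
    """Lower bound from (10.3) for the decrease in 2I(x:x*_a) when one cell
    is omitted from the fitting. Terms with a zero count are taken as zero."""
    first = x_cell * log(x_cell / fitted_cell) if x_cell > 0 else 0.0
    rest = n - x_cell
    second = rest * log(rest / (n - fitted_cell)) if rest > 0 else 0.0
    return 2.0 * (first + second)

# Hypothetical cell: observed 20, fitted 10, total n = 100.
value = outlier_bound(20, 10, 100)
print(value, value > 3.5)  # compare against the threshold of about 3.5 suggested in the text
```

When the observed and fitted values coincide the bound is zero, so only cells that the model fits poorly produce large OUTLIER entries.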


The ratio

(10.4)  $\dfrac{2I(x : x^*_a) - 2I(x : x^*_b)}{2I(x : x^*_a)} = \dfrac{2I(x^*_b : x^*_a)}{2I(x : x^*_a)}$

then indicates the percentage of the "unexplained variation" due to the
outlier value.


11.  The  2x2  Table 


It  may  be  useful  to  reexamine  the  2x2  table  from  the  point  of 
view  of  the  preceding  discussion.  The  algebraic  details  are  simple  in  this 
case  and  exhibit  the  unification  of  the  information  theoretic  development. 


Suppose  we  have  the  observed  2x2  table  in  Figure  11.1. 


  $x(11)$   $x(12)$  |  $x(1.)$
  $x(21)$   $x(22)$  |  $x(2.)$
  -------------------+---------
  $x(.1)$   $x(.2)$  |  $n$

Figure 11.1.
If we obtain the m.d.i. estimate fitting the one-way marginals, the
generalized independence hypothesis is the classical independence
hypothesis and the minimum discrimination information estimate is $x^*(ij) = x(i.)x(.j)/n$.
The representation of the log-linear regression (6.7) as
in Figure 8.1 for the full model is given in Figure 11.2. The entries in
the columns $\tau_1$, $\tau_2$, $\tau_3$
are, respectively, the values of the functions $T_1(ij)$, $T_2(ij)$, $T_3(ij)$
associated with the marginals $\theta_1 = x(1.)$, $\theta_2 = x(.1)$, $\theta_3 = x(11)$,
and the column headed $L$ corresponds to the normalizing factor (the
negative of the logarithm of the moment-generating function as in (6.7)).

  $ij$  |  $L$   $\tau_1$   $\tau_2$   $\tau_3$
  ------+--------------------------------------
  11    |  1     1          1          1
  12    |  1     1
  21    |  1                1
  22    |  1

Figure 11.2.


We recall the interpretation of Figure 11.2 as the log-linear
relations

(11.1)
  $\ln \dfrac{x(11)}{n\pi} = L + \tau_1 + \tau_2 + \tau_3$
  $\ln \dfrac{x(12)}{n\pi} = L + \tau_1$
  $\ln \dfrac{x(21)}{n\pi} = L + \tau_2$
  $\ln \dfrac{x(22)}{n\pi} = L$ .

From (11.1) we find

(11.2)
  $L = \ln \left( x(22) / (n/4) \right)$ ,
  $\tau_1 = \ln \left( x(12) / x(22) \right)$ ,
  $\tau_2 = \ln \left( x(21) / x(22) \right)$ ,
  $\tau_3 = \ln \left( x(11)x(22) / x(12)x(21) \right)$ ,

that is,

(11.3)
  $\tau_1 = \ln x(12) - \ln x(22)$ ,
  $\tau_2 = \ln x(21) - \ln x(22)$ ,
  $\tau_3 = \ln x(11) + \ln x(22) - \ln x(12) - \ln x(21)$ .

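If we call $T$ the matrix with columns the columns of the design matrix
of Figure 11.2, that is,

(11.4)  $T = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}$ ,

and define a diagonal matrix $D$ with main diagonal the elements $x(ij)$,
that is,

(11.5)  $D = \begin{pmatrix} x(11) & 0 & 0 & 0 \\ 0 & x(12) & 0 & 0 \\ 0 & 0 & x(21) & 0 \\ 0 & 0 & 0 & x(22) \end{pmatrix}$ ,

then the estimate of the covariance matrix of $\theta_1 = x(1.)$, $\theta_2 = x(.1)$,
$\theta_3 = x(11)$ for the observed contingency table is $\Sigma = A_{22.1}$ where

(11.6)  $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} = T'DT$ ,

(11.7)  $A_{22.1} = A_{22} - A_{21} A_{11}^{-1} A_{12}$ ,

and $A_{11}$ is 1x1, $A_{22}$ is 3x3, $A_{12} = A_{21}'$ is 1x3. It is
found that

(11.8)  $\Sigma = \begin{pmatrix} \dfrac{x(1.)x(2.)}{n} & x(11) - \dfrac{x(1.)x(.1)}{n} & \dfrac{x(11)x(2.)}{n} \\ x(11) - \dfrac{x(1.)x(.1)}{n} & \dfrac{x(.1)x(.2)}{n} & \dfrac{x(11)x(.2)}{n} \\ \dfrac{x(11)x(2.)}{n} & \dfrac{x(11)x(.2)}{n} & \dfrac{x(11)(n - x(11))}{n} \end{pmatrix}$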
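and the inverse matrix is

(11.9)  $\Sigma^{-1} = \begin{pmatrix} \dfrac{1}{x(12)} + \dfrac{1}{x(22)} & \dfrac{1}{x(22)} & -\dfrac{1}{x(12)} - \dfrac{1}{x(22)} \\ \dfrac{1}{x(22)} & \dfrac{1}{x(21)} + \dfrac{1}{x(22)} & -\dfrac{1}{x(21)} - \dfrac{1}{x(22)} \\ -\dfrac{1}{x(12)} - \dfrac{1}{x(22)} & -\dfrac{1}{x(21)} - \dfrac{1}{x(22)} & \dfrac{1}{x(11)} + \dfrac{1}{x(12)} + \dfrac{1}{x(21)} + \dfrac{1}{x(22)} \end{pmatrix}$ .

We remark that the matrix in (11.9) is the covariance matrix of the $\tau$'s
in (11.3).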
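
The relation between the covariance matrix of the marginals and the matrix of reciprocal sums can be checked numerically. The sketch below forms T'DT for a hypothetical 2x2 table, takes the partitioned matrix A22 - A21 A11^(-1) A12, and multiplies it by the matrix with entries built from reciprocals of the cell counts; the product should be the 3x3 identity. Function names and counts are assumptions introduced for illustration, and the signs of the off-diagonal reciprocal entries are those implied by the definitions of the taus.

```python
def covariance_of_marginals(x11, x12, x21, x22):
    """Return A22 - A21 A11^(-1) A12 for A = T'DT (a 3x3 nested list)."""
    x = [x11, x12, x21, x22]
    T = [[1, 1, 1, 1], [1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 0]]
    A = [[sum(x[w] * T[w][a] * T[w][b] for w in range(4)) for b in range(4)]
         for a in range(4)]
    return [[A[a][b] - A[a][0] * A[0][b] / A[0][0] for b in range(1, 4)]
            for a in range(1, 4)]

def covariance_of_taus(x11, x12, x21, x22):
    """Return the matrix of reciprocal sums (covariance matrix of the taus)."""
    r11, r12, r21, r22 = 1 / x11, 1 / x12, 1 / x21, 1 / x22
    return [[r12 + r22, r22, -(r12 + r22)],
            [r22, r21 + r22, -(r21 + r22)],
            [-(r12 + r22), -(r21 + r22), r11 + r12 + r21 + r22]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

counts = (10, 20, 30, 40)  # hypothetical counts
identity = matmul(covariance_of_marginals(*counts), covariance_of_taus(*counts))
print(identity)
```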

Note that the value of the logarithm of the cross-product ratio, a
measure of association or interaction, appears in the course of the analysis
as the value of $\tau_3$ for the observed values $x(ij)$, and that $\tau_3 = 0$ for
$x^*(ij)$, the estimate under the hypothesis of independence, for which the
representation as in Figure 11.2 does not involve the last column since it
is obtained by fitting the one-way marginals.

The log-linear relations for the estimate $x^*(ij)$ are

(11.10)
  $\ln \dfrac{x^*(11)}{n\pi} = L + \tau_1 + \tau_2$
  $\ln \dfrac{x^*(12)}{n\pi} = L + \tau_1$
  $\ln \dfrac{x^*(21)}{n\pi} = L + \tau_2$
  $\ln \dfrac{x^*(22)}{n\pi} = L$ ,

where the numerical values of $L$, $\tau_1$, $\tau_2$ in (11.10) depend on $x^*$ and
differ from the values in (11.1).

The minimum discrimination information statistic to test the null
hypothesis or model of independence is $2I(x : x^*)$ with one degree of
freedom. In this case the quadratic approximation is

(11.11)  $2I(x : x^*) \approx \left( x(11) - \dfrac{x(1.)x(.1)}{n} \right)^2 \left( \dfrac{1}{x^*(11)} + \dfrac{1}{x^*(12)} + \dfrac{1}{x^*(21)} + \dfrac{1}{x^*(22)} \right)$ .

Remembering that $x^*(ij) = x(i.)x(.j)/n$, the right-hand side of (11.11)
may also be shown to be

(11.12)  $\chi^2 = \sum_{ij} \dfrac{\left( x(ij) - x(i.)x(.j)/n \right)^2}{x(i.)x(.j)/n}$ ,

the classical $\chi^2$-test for independence with one degree of freedom. Another
test which has been proposed for the null hypothesis of no association or
no interaction in the 2x2 table is

(11.13)  $\left( \ln x(11) + \ln x(22) - \ln x(12) - \ln x(21) \right)^2 \left( \dfrac{1}{x(11)} + \dfrac{1}{x(12)} + \dfrac{1}{x(21)} + \dfrac{1}{x(22)} \right)^{-1}$

which may be shown to be a quadratic approximation for $2I(x : x^*)$ in terms
of $\tau_3$ with the covariance matrix estimated using the observed values and
not the estimated values. We remark that if the observed values are used
to estimate the covariance matrix then instead of the classical $\chi^2$-test in
(11.12) there is derived the modified Neyman chi-square

(11.14)  $\chi_1^2 = \sum_{ij} \dfrac{\left( x(ij) - x(i.)x(.j)/n \right)^2}{x(ij)}$ .
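
The four statistics discussed for the 2x2 table can be compared side by side. The sketch below evaluates them on a hypothetical table with positive cells; the function name and the counts are assumptions, and the dictionary keys are labels chosen here for readability.

```python
from math import log

def two_by_two_statistics(x11, x12, x21, x22):
    """Return the m.d.i. statistic and three quadratic approximations."""
    x = {(1, 1): x11, (1, 2): x12, (2, 1): x21, (2, 2): x22}
    n = sum(x.values())
    row = {1: x11 + x12, 2: x21 + x22}
    col = {1: x11 + x21, 2: x12 + x22}
    fit = {(i, j): row[i] * col[j] / n for (i, j) in x}  # x*(ij)
    return {
        "mdi": 2.0 * sum(v * log(v / fit[c]) for c, v in x.items()),
        "quadratic": (x11 - fit[(1, 1)]) ** 2 * sum(1 / f for f in fit.values()),
        "pearson": sum((v - fit[c]) ** 2 / fit[c] for c, v in x.items()),
        "log_cross_product": (log(x11) + log(x22) - log(x12) - log(x21)) ** 2
                             / sum(1 / v for v in x.values()),
        "neyman": sum((v - fit[c]) ** 2 / v for c, v in x.items()),
    }

stats = two_by_two_statistics(10, 20, 30, 40)  # hypothetical counts
for name, value in stats.items():
    print(f"{name:18s} {value:.4f}")
```

The "quadratic" and "pearson" entries agree exactly, as stated in the text; the other values are close to, but not identical with, the m.d.i. statistic.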

12. An Analysis

In order to coordinate and relate the various definitions, concepts,
parameters, computational features, etc. discussed in the preceding
sections we shall consider in detail the analysis of a specific contingency
table.

Table 12.1 is a four-way contingency table of 14,053 marines who
enlisted in 1966 or 1967, cross-classified on the variables home of
record, level of education, race and boot camp completion. We denote the
occurrences in the four-way cross-classification or contingency Table 12.1
by $x(ijk\ell)$ with the notation

Variable             Index     1            2           3            4

Home of Record       $i$       East         North       West         South
Level of Education   $j$       Below H.S.   H.S.        Above H.S.
Race                 $k$       White        Non-white
Boot Camp            $\ell$    Failed       Passed

For this data we are interested in the possible relationship of success
in  boot  camp  as  a  dependent  variable  on  the  independent  or  explanatory 
variables  home  of  record,  level  of  education,  and  race.  To  obtain  a 
smoothed  estimate  of  the  observed  cross-classification  utilizing  signifi¬ 
cant  effects  and  interactions  we  shall  examine  a  sequence  of  minimum 
discrimination  information  estimates  based  on  nested  sets  of  fitted 
marginals.  That  is,  each  successive  estimate  uses  a  set  of  marginals 
which  explicitly  or  implicitly  contains  the  marginals  of  the  preceding 
estimate  and  also  additional  ones  to  determine  the  effect  of  the  addi¬ 
tional  marginals  or  their  associated  interaction  tau  parameters.  The 
analysis  of  information  table  permits  us  to  judge  the  significance  or 
non-significance  of  these  effects  or  interaction  tau  parameters. 

12.1. Fitting Nested Sets of Marginals. Since we are interested in
the possible relationship of success in boot camp on home of record, level
of education and race, we first fit the marginals $x(ijk.)$, $x(...\ell)$
since the corresponding estimate $x^*(ijk\ell) = x(ijk.)x(...\ell)/n$ is that
under the null hypothesis or model of independence of success and the
joint variable (home of record, level of education, race) or no
interaction between success and the joint variable. In other words we first
want to determine whether the 24 columns of Table 12.1 are homogeneous
or not with respect to the underlying probabilities of passing or failing.
The associated m.d.i. statistic is

  $2I(x : x^*) = 2 \sum_i \sum_j \sum_k \sum_\ell x(ijk\ell) \ln \left( x(ijk\ell) / x^*(ijk\ell) \right) = 160.551$

with 23 degrees of freedom. We reject the hypothesis of independence or
no interaction. We therefore shall look for explanatory effects.

In Figure 12.1 there is given the complete schematic for the
log-linear representations. The representation for the estimate of joint
independence $x^*(ijk\ell) = x(ijk.)x(...\ell)/n$ uses columns 1-17, 21-22,
26-31 corresponding to all the marginals explicit and implicit in the
fitted marginal constraints. We can also interpret $2I(x : x^*)$ as testing
a null hypothesis or model that the 23 tau parameters in the
representation of $x$ but not in $x^*$ are zero, that is, the parameters corresponding
to columns 18-20, 23-25, 32-48.
The value of $2I(x : x^*)$ is so large that we reject the model of
joint independence. We therefore proceed to fit a sequence of nested
marginals all including $x(ijk.)$ and various combinations of two- and
three-way marginals containing success with other variables. We
summarize some results in the truncated Analysis of Information Table 12.2.

We have not included all the intermediate fitting sequences for
conciseness. We remark that although the measure of the effect of additional
marginals or their associated parameters may vary according to the
sequence in which they have been added, significant effects tend to
remain significant and non-significant effects tend to stay
non-significant, so that the first overall survey should determine the
estimates and interaction parameters which warrant further investigation.
For example, the effect of adding $x(..k\ell)$ to $x(ijk.)$, $x(i..\ell)$,
$x(.j.\ell)$ is given in Analysis of Information Table 12.3 as $2I(x^*_f : x^*_a) = 1.410$
with one degree of freedom, but the effect of adding $x(..k\ell)$ to
$x(ijk.)$, $x(ij.\ell)$ is given in Analysis of Information Table 12.2 as
$2I(x^*_e : x^*_m) = 1.239$ with one degree of freedom. In neither case is the
effect or the corresponding tau parameter significant.

The columns of Figure 12.1 which occur in the log-linear
representations of the estimates retained in Analysis of Information Table
12.2 are

Marginals Fitted                           Estimate    Columns of Figure 12.1

$x(ijk.)$, $x(...\ell)$                    $x^*$       1-17, 21-22, 26-31
$x(ijk.)$, $x(i..\ell)$, $x(.j.\ell)$      $x^*_a$     1-24, 26-31
$x(ijk.)$, $x(ij.\ell)$                    $x^*_m$     1-24, 26-37
$x(ijk.)$, $x(ij.\ell)$, $x(..k\ell)$      $x^*_e$     1-37 .

From  the  analytic  form  of  the  log-linear  representation  or  by 
taking  differences  of  appropriate  rows  of  Figure  12.1  within  the  columns 
used  for  the  estimate,  the  log-odds  of  fail  to  pass  for  each  of  the 
estimates  are  given  by  the  respective  parametric  representations  in  (12.1) 
where  the  superscripts  relate  to  the  variables  and  the  subscripts  range 
over  the  possible  indices.  The  values  of  the  parameters  depend  of  course 
on  the  corresponding  estimate. 
TABLE 12.2

ANALYSIS OF INFORMATION TABLE

Component Due to                                Information                        D.F.

    $x(ijk.)$, $x(...\ell)$                     $2I(x : x^*) = 160.551$            23

a)  $x(ijk.)$, $x(i..\ell)$, $x(.j.\ell)$       $2I(x^*_a : x^*) = 138.732$        5
                                                $2I(x : x^*_a) = 21.819$           18

m)  $x(ijk.)$, $x(ij.\ell)$                     $2I(x^*_m : x^*_a) = 7.384$        6
                                                $2I(x : x^*_m) = 14.435$           12

e)  $x(ijk.)$, $x(ij.\ell)$, $x(..k\ell)$       $2I(x^*_e : x^*_m) = 1.239$        1
                                                $2I(x : x^*_e) = 13.196$           11

  $\dfrac{2I(x : x^*) - 2I(x : x^*_a)}{2I(x : x^*)} = \dfrac{138.732}{160.551} = 0.86$

  $\dfrac{2I(x : x^*) - 2I(x : x^*_m)}{2I(x : x^*)} = \dfrac{146.116}{160.551} = 0.91$

  $\dfrac{2I(x : x^*) - 2I(x : x^*_e)}{2I(x : x^*)} = \dfrac{147.355}{160.551} = 0.92$

TABLE 12.3

ANALYSIS OF INFORMATION TABLE

Component Due to                                           Information                     D.F.

a)  $x(ijk.)$, $x(i..\ell)$, $x(.j.\ell)$                  $2I(x : x^*_a) = 21.819$        18

f)  $x(ijk.)$, $x(i..\ell)$, $x(.j.\ell)$, $x(..k\ell)$    $2I(x^*_f : x^*_a) = 1.410$     1
                                                           $2I(x : x^*_f) = 20.409$        17
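
The effect components and the ratios reported with Table 12.2 follow from the interaction values by subtraction and division. The short sketch below repeats that arithmetic using only the interaction statistics and degrees of freedom quoted in the tables; the variable names are hypothetical.

```python
# Interaction components 2I(x:x*_.) and their degrees of freedom, as quoted.
interaction = {
    "independence": (160.551, 23),
    "a": (21.819, 18),
    "m": (14.435, 12),
    "e": (13.196, 11),
}

base_info, base_df = interaction["independence"]
order = ["independence", "a", "m", "e"]

# Effect of each step = difference of successive interaction components.
effects = {}
for prev, cur in zip(order, order[1:]):
    info = interaction[prev][0] - interaction[cur][0]
    df = interaction[prev][1] - interaction[cur][1]
    effects[cur] = (round(info, 3), df)

# Fraction of the variation unexplained under independence that each
# estimate accounts for, as in (9.2).
ratios = {k: round((base_info - interaction[k][0]) / base_info, 2)
          for k in order[1:]}

print(effects)
print(ratios)
```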
(12.1)
  $\ln \dfrac{x^*_a(ijk1)}{x^*_a(ijk2)} = \tau^{\ell}_{1} + \tau^{i\ell}_{i1} + \tau^{j\ell}_{j1}$

  $\ln \dfrac{x^*_m(ijk1)}{x^*_m(ijk2)} = \tau^{\ell}_{1} + \tau^{i\ell}_{i1} + \tau^{j\ell}_{j1} + \tau^{ij\ell}_{ij1}$

  $\ln \dfrac{x^*_e(ijk1)}{x^*_e(ijk2)} = \tau^{\ell}_{1} + \tau^{i\ell}_{i1} + \tau^{j\ell}_{j1} + \tau^{k\ell}_{k1} + \tau^{ij\ell}_{ij1}$

We recall that parameters with indices $i = 4$ and/or $j = 3$
and/or $k = 2$ and/or $\ell = 2$ are by convention set equal to zero.

We remark that $x^*_m(ijk\ell)$, determined by fitting the marginals
$x(ijk.)$, $x(ij.\ell)$, is expressible explicitly as

(12.2)  $x^*_m(ijk\ell) = x(ijk.)\,x(ij.\ell) / x(ij..)$

and is the estimate under a null hypothesis that race and success are
conditionally independent given home of record and level of education.

In Analysis of Information Table 12.2 the value $2I(x : x^*_m) = 14.435$,
12 degrees of freedom, indicates an acceptable fit of this model.
Furthermore, $2I(x^*_e : x^*_m) = 1.239$, one degree of freedom, implies that the
additional effect of the marginal $x(..k\ell)$ is not significant or that
in the parametric representation of the log-odds in (12.1) the parameter
$\tau^{k\ell}_{11}$ measuring the effect of race on the dependent variable success is
not significant. We therefore investigate the estimate $x^*_m$ in greater
detail. The values of $x^*_m(ijk\ell)$ are given in Table 12.4.


In the expression for the log-odds under $x^*_m$ in (12.1), $\tau^{\ell}_{1}$ is
an overall average, $\tau^{i\ell}_{i1}$ and $\tau^{j\ell}_{j1}$
are the effects of home of record and
level of education on boot camp completion and $\tau^{ij\ell}_{ij1}$ is the interaction
effect of home of record x level of education on boot camp completion.

The numerical values of the tau parameters are given in Table 12.5. We
recall that by convention parameters with an index corresponding to
$i = 4$ and/or $j = 3$ and/or $\ell = 2$ are equal to zero.
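TABLE 12.5

VALUES OF PARAMETERS IN LOG-ODDS FOR $x_m^*$ IN (12.1)

$\tau_1^{\ell}$       = -4.454347      $\tau_{111}^{ij\ell}$ = -0.292478
$\tau_{11}^{i\ell}$   =  0.728653      $\tau_{121}^{ij\ell}$ = -0.689433
$\tau_{21}^{i\ell}$   =  0.041549      $\tau_{211}^{ij\ell}$ = -0.602433
$\tau_{31}^{i\ell}$   = -1.632427      $\tau_{221}^{ij\ell}$ = -1.003045
$\tau_{11}^{j\ell}$   =  1.312903      $\tau_{311}^{ij\ell}$ =  1.137932
$\tau_{21}^{j\ell}$   =  0.648130      $\tau_{321}^{ij\ell}$ =  0.360697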


From  the  parametric  representation  of  the  log-odds  in  (12.1)  and 
the  values  in  Table  12.5  one  can  determine  differences  in  the  log-odds 
associated  with  changes  in  various  categories.  Thus  the  differences  in 
the  log-odds  (fail  to  pass)  as  one  changes  the  home  of  record,  for  fixed 
level  of  education, are  given  by 


               E-N       E-W       E-S
Below H.S.   0.9970    0.7287    0.4362
H.S.         1.0007    1.3110    0.0392
Above H.S.   0.6871    2.3611    0.7287

The  differences  in  the  log-odds  as  one  changes  the  level  of  education  for 
fixed  home  of  record  are  given  by 


          Below H.S.-H.S.    H.S.-Above H.S.
East           1.0617            -0.0413
North          1.0654            -0.3549
West           1.4420             1.0088
South          0.6648             0.6481
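The differences above can be recomputed from (12.1) and the Table 12.5 values. The following is a sketch in Python, with hypothetical names, in which the unlisted parameters (South, Above H.S.) are set to zero as the convention states. Evaluated this way, the entries of both tables are reproduced to within rounding, with one exception: the E-W, Below H.S. difference comes out near 0.9307, whereas the table prints 0.7287.

```python
# Parameter values transcribed from Table 12.5 (model x*_m); parameters not
# listed there (i = 4, j = 3) are zero by the stated convention.
TAU_0 = -4.454347
TAU_HOME = {"East": 0.728653, "North": 0.041549, "West": -1.632427, "South": 0.0}
TAU_EDUC = {"Below H.S.": 1.312903, "H.S.": 0.648130, "Above H.S.": 0.0}
TAU_INTER = {
    ("East", "Below H.S."): -0.292478, ("East", "H.S."): -0.689433,
    ("North", "Below H.S."): -0.602433, ("North", "H.S."): -1.003045,
    ("West", "Below H.S."): 1.137932, ("West", "H.S."): 0.360697,
}


def log_odds(home, educ):
    """Log-odds of fail to pass under x*_m, following (12.1)."""
    return (TAU_0 + TAU_HOME[home] + TAU_EDUC[educ]
            + TAU_INTER.get((home, educ), 0.0))


def home_difference(home_a, home_b, educ):
    """Change in log-odds between two homes of record, education fixed."""
    return log_odds(home_a, educ) - log_odds(home_b, educ)


def educ_difference(home, educ_a, educ_b):
    """Change in log-odds between two education levels, home fixed."""
    return log_odds(home, educ_a) - log_odds(home, educ_b)


for educ in TAU_EDUC:
    row = [round(home_difference("East", h, educ), 4)
           for h in ("North", "West", "South")]
    print(educ, row)
```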

For  easier  interpretation,  we  convert  the  log-odds  values  to  ratios 
of  the  odds  of  failure. 
               E-N     E-W     E-S
Below H.S.     2.7     2.1     1.6
H.S.           2.7     3.7     1.0
Above H.S.     2.0    10.6     2.1

          Below H.S.-H.S.    H.S.-Above H.S.
East            2.9               0.96
North           2.9               0.70
West            4.2               2.7
South           1.9               1.9

Note  that  the  odds  of  failure  in  boot  camp  of  a  recruit  with  home 
of  record  East  and  Above  H.S.  level  of  education  are  10.6  times  the  odds 
of  a  recruit  with  the  same  level  of  education  but  home  of  record  West. 
Recruits  with  home  of  record  East  or  North  but  with  level  of  education 
H.S.  do  better  than  recruits  with  same  home  of  record  but  Above  H.S.  level 
of  education. 
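As a worked check of the first statement, using the Table 12.5 values (for Above H.S. the education and interaction terms vanish by the zero convention):

$$
\ln\frac{\text{odds}(\text{East},\ \text{Above H.S.})}{\text{odds}(\text{West},\ \text{Above H.S.})}
= \tau_{11}^{i\ell} - \tau_{31}^{i\ell}
= 0.728653 - (-1.632427) = 2.361080,
\qquad e^{2.361080} \approx 10.6 .
$$

Similarly, the H.S. to Above H.S. differences of $-0.0413$ (East) and $-0.3549$ (North) give odds ratios $e^{-0.0413} \approx 0.96$ and $e^{-0.3549} \approx 0.70$, both below one.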

We have also computed the odds of failure $x_m^*(ijk1)/x_m^*(ijk2)$ and
listed the results in increasing values. The odds are expressed to 1,000,
that is, 5 to 1,000, 6 to 1,000, etc.


Home of Record    Level of Education    Odds

West              Above H.S.              2
West              H.S.                    6
North             H.S.                    9
South             Above H.S.             12
North             Above H.S.             12
South             H.S.                   22
East              H.S.                   23
East              Above H.S.             24
North             Below H.S.             25
West              Below H.S.             26
South             Below H.S.             43
East              Below H.S.             67
Note that the overall odds of failure for these data are 311/13742 =
0.0226, or 23 to 1,000.
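The odds listed above follow from exponentiating the log-odds in (12.1). A short Python sketch, assuming the Table 12.5 values and the zero convention (the function name `odds_per_thousand` is illustrative only):

```python
import math

# Table 12.5 parameters for x*_m (zero where i = 4 or j = 3, by convention).
TAU_0 = -4.454347
HOME = {"East": 0.728653, "North": 0.041549, "West": -1.632427, "South": 0.0}
EDUC = {"Below H.S.": 1.312903, "H.S.": 0.648130, "Above H.S.": 0.0}
INTER = {
    ("East", "Below H.S."): -0.292478, ("East", "H.S."): -0.689433,
    ("North", "Below H.S."): -0.602433, ("North", "H.S."): -1.003045,
    ("West", "Below H.S."): 1.137932, ("West", "H.S."): 0.360697,
}


def odds_per_thousand(home, educ):
    """Odds of failure under x*_m, expressed to 1,000."""
    log_odds = TAU_0 + HOME[home] + EDUC[educ] + INTER.get((home, educ), 0.0)
    return 1000.0 * math.exp(log_odds)


# Rank all twelve home-by-education combinations by increasing odds.
ranked = sorted((odds_per_thousand(h, e), h, e) for h in HOME for e in EDUC)
for odds, home, educ in ranked:
    print(f"{home:6s} {educ:11s} {odds:6.1f}")

# Overall odds of failure quoted in the text, for comparison.
overall = 1000.0 * 311 / 13742
print(f"overall {overall:.1f}")
```

Comparing each row with the overall figure shows which groups fall below or above the average odds of failure.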

For  ease  of  comparison  and  inference,  we  also  list  the  foregoing 
results  by  home  of  record  and  level  of  education. 


               West    North    South    East
Above H.S.       2       12       12      24
H.S.             6        9       22      23
Below H.S.      26       25       43      67

Examination of the computer output for $x_m^*(ijk\ell)$ shows that for
West, Above H.S., Non-white, Fail, the value of OUTLIER is 4.28. From
Table 12.1, we see that the corresponding observed values are given by
the two-way table

West, Above H.S., $x(33k\ell)$

          White    Non-white    Total
Fail          0            1        1
Pass        421           19      440
Total       421           20      441

and from Table 12.4, the corresponding estimated values are

West, Above H.S., $x_m^*(33k\ell)$

            White    Non-white      Total
Fail        0.955        0.045      1.000
Pass      420.045       19.955    440.000
Total     421.000       20.000    441.000

Testing the observed two-way table West, Above H.S., $x(33k\ell)$ for
independence of race and boot camp completion by the statistic

$$
2\sum_k\sum_\ell x(33k\ell)\ln\frac{x(33k\ell)}{x(33k.)\,x(33.\ell)/x(33..)}
= 2\Big(\sum_k\sum_\ell x(33k\ell)\ln x(33k\ell) + x(33..)\ln x(33..)
- \sum_k x(33k.)\ln x(33k.) - \sum_\ell x(33.\ell)\ln x(33.\ell)\Big)
$$

yields the value 6.236, one degree of freedom. (Tables of $2n \ln n$,
$n$ an integer 1 to 10,000, are available for such calculations.) The
contribution of West, Above H.S. to the value of $2I(x:x_m^*)$ is obtained
by the computer as

$$
2\left(0\ln\frac{0}{0.955} + 1\ln\frac{1}{0.045} + 421\ln\frac{421}{420.045} + 19\ln\frac{19}{19.955}\right)
$$

and yields the same value 6.236.
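Both routes to this value can be checked with a few lines of Python. This is a sketch under the usual convention that 0 ln 0 = 0; the function names are hypothetical. With unrounded fitted values both computations give about 6.235, which agrees with the quoted 6.236 up to rounding.

```python
import math


def xlogx(x):
    """x * ln(x), with the convention 0 * ln 0 = 0."""
    return x * math.log(x) if x > 0 else 0.0


def two_i_independence(table):
    """2I statistic for independence in a two-way table of counts."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    cells = sum(xlogx(x) for r in table for x in r)
    return 2.0 * (cells + xlogx(n) - sum(map(xlogx, rows)) - sum(map(xlogx, cols)))


def two_i_against_fit(observed, fitted):
    """2 * sum of x ln(x / x*), skipping cells with zero observed count."""
    return 2.0 * sum(x * math.log(x / f)
                     for obs_row, fit_row in zip(observed, fitted)
                     for x, f in zip(obs_row, fit_row) if x > 0)


# Rows: White, Non-white; columns: Fail, Pass (West, Above H.S.).
observed = [[0, 421], [1, 19]]
rows = [sum(r) for r in observed]
cols = [sum(c) for c in zip(*observed)]
fitted = [[r * c / sum(rows) for c in cols] for r in rows]

print(round(two_i_independence(observed), 3))
print(round(two_i_against_fit(observed, fitted), 3))
```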

Because the value 6.236 is statistically significant at the 0.02
level, the OUTLIER statistic has shown an "unusual" situation for
$x_m^*(ijk\ell)$ corresponding to West, Above H.S.
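The significance statement can be checked numerically, assuming (as the quoted degrees of freedom suggest) that the statistic is referred to a chi-squared distribution. For one degree of freedom the upper-tail probability has the closed form erfc(sqrt(x/2)), so the Python standard library suffices; the helper name is hypothetical.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-squared variable with 1 d.f."""
    return math.erfc(math.sqrt(x / 2.0))


p_value = chi2_sf_1df(6.236)
print(f"p = {p_value:.4f}")
```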

We shall consider the procedure to account for outliers after we
examine the estimate $x_a^*$.

In view of the fact that the Analysis of Information Table 12.2
shows no significant effects for the estimates following $x_a^*$ and since
$2I(x:x_a^*) = 21.819$, 18 degrees of freedom, implies an acceptable fit,
let us examine the estimate $x_a^*$ with possible outliers in mind. The
values of the estimate $x_a^*$ are given in Table 12.6.

The log-odds of fail to pass for $x_a^*$ are given in (12.1) with
the parameters having the same interpretation as those for $x_m^*$ except
that there is no interaction effect. The values of the parameters for
$x_a^*$ are given in Table 12.7.

For the estimate $x_a^*$ the ratio of the odds of failure between
different homes of record is the same for all levels of education and,
of course, the ratio of the odds of failure for different educational
levels is the same for all homes of record. For the ratio of odds, and
odds, see Tables 12.8, 12.9 and 12.10.
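The constancy follows directly from the additive form of the log-odds for $x_a^*$ in (12.1): for two homes of record $i$ and $i'$ at the same level of education $j$, the common terms cancel,

$$
\ln\frac{\text{odds}(i, j)}{\text{odds}(i', j)}
= \left(\tau_1^{\ell} + \tau_{i1}^{i\ell} + \tau_{j1}^{j\ell}\right)
- \left(\tau_1^{\ell} + \tau_{i'1}^{i'\ell} + \tau_{j1}^{j\ell}\right)
= \tau_{i1}^{i\ell} - \tau_{i'1}^{i'\ell},
$$

which does not involve $j$. As an illustration with the $x_a^*$ column of Table 12.7, East against West gives $0.285423 - (-0.889058) = 1.174481$, an odds ratio of about $e^{1.174481} \approx 3.2$ at every level of education.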

Examination of the computer output for $x_a^*$ shows an OUTLIER value
of 5.20 for West, Above H.S., White, Fail and an OUTLIER value of 3.54
for South, H.S., Non-white, Fail. The corresponding observed and
estimated cell entries are
TABLE 12.7

PARAMETER VALUES IN LOG-ODDS REPRESENTATION

                          $x_a^*$       $x_b^*$       $x_c^*$
$\tau_1^{\ell}$         -4.192224     -4.059831     -4.105023
$\tau_{11}^{i\ell}$      0.285423      0.288534      0.364671
$\tau_{21}^{i\ell}$     -0.680394     -0.680769     -0.602516
$\tau_{31}^{i\ell}$     -0.889058     -0.771589     -0.690762
$\tau_{11}^{j\ell}$      1.168221      1.025047      1.019191
$\tau_{21}^{j\ell}$      0.212164      0.067678     -0.001819

TABLE  12.8 

RATIOS  OF  THE  ODDS  OF  FAILURE 
TABLE 12.10

ODDS OF FAILURE, EXPRESSED TO 1,000

                         West                   North   South                    East
Above H.S.  $x_a^*$      6                        8     15                        20
            $x_b^*$      0, 8 (White, Non-white)  9     17                        23
            $x_c^*$      0, 8 (White, Non-white)  9     16                        24

H.S.        $x_a^*$      8                        9     19                        25
            $x_b^*$      9                        9     18                        25
            $x_c^*$      8                        9     16, 32 (White, Non-white) 24

Below H.S.  $x_a^*$     20                       25     49                        65
            $x_b^*$     22                       24     48                        64
            $x_c^*$     23                       25     46                        66
West, Above H.S.

               $x(33k\ell)$              $x_a^*(33k\ell)$
          White    Non-white         White    Non-white
Fail          0            1         2.599        0.123
Pass        421           19       418.401       19.877
Total       421           20       421.000       20.000

South, H.S.

               $x(42k\ell)$              $x_a^*(42k\ell)$
          White    Non-white         White    Non-white
Fail         34           16        32.557        9.611
Pass       1741          508      1742.443      514.389
Total      1775          524      1775.000      524.000

12.2. The Estimate $x_m^*(ijk\ell)$ Adjusted for Outliers. For all estimates
considered under the nested marginal hypotheses, a requirement was that
$x^*(ijk.) = x(ijk.)$. Accordingly for the model with interaction we require
the modified estimate to be fitted using the marginals $x(ijk.)$, $x(ij.\ell)$
derived from all observations except the outliers $x(3311)$ and $x(3312)$.
We shall use the observed values as the estimates for the outlier cells.
Thus if we denote the modified estimates by $x_r^*(ijk\ell)$ we have
$x_r^*(3311) = x(3311)$ and $x_r^*(3312) = x(3312)$.

Because of the marginals used for fitting, it turns out that the
values of the modified estimate $x_r^*(ijk\ell)$ are equal to the values of the
original estimate $x_m^*(ijk\ell)$ (since $x_m^*(ijk\ell) = x(ijk.)x(ij.\ell)/x(ij..)$)
except, of course, for the cells (3311) and (3312), and to satisfy the
requirement that $x_r^*(ij.\ell) = x(ij.\ell)$ it follows that $x_r^*(3321) = x(3321)$,
$x_r^*(3322) = x(3322)$. The associated Analysis of Information Table 12.11
follows.
TABLE 12.11

ANALYSIS OF INFORMATION TABLE

Component Due to                      Information                    D.F.

m) $x(ijk.)$, $x(ij.\ell)$             $2I(x:x_m^*) = 14.435$          12

r) $x(ijk.)$, $x(ij.\ell)$ less        $2I(x_r^*:x_m^*) = 6.235$        1
   $x(3311)$, $x(3312)$                $2I(x:x_r^*) = 8.200$           11


Note that since $x_r^*(ijk\ell) = x_m^*(ijk\ell)$ except that $x_r^*(3311) =
x(3311)$, $x_r^*(3312) = x(3312)$, $x_r^*(3321) = x(3321)$, $x_r^*(3322) = x(3322)$,
the value of the measure of effect $2I(x_r^*:x_m^*)$ is the same as that earlier
derived in the test for conditional independence.
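The fitting with excluded cells can be sketched as follows. The passage does not say how the marginals are fitted numerically; the sketch assumes iterative proportional fitting, which is a common choice, and all names (`fit_margins`, the cell-tuple layout) are hypothetical. Excluded cells are held at their observed values and are left out of both the observed and the fitted marginal totals. Only the West, Above H.S. slab is used as data, since those are the counts quoted in this section.

```python
from collections import defaultdict


def fit_margins(observed, margins, excluded=(), sweeps=50):
    """Iterative proportional fitting with some cells held at observed values.

    observed: dict mapping cell tuples (i, j, k, l) to counts.
    margins:  list of tuples of axis positions defining each fitted marginal.
    excluded: cells omitted from the fit and set equal to the observations.
    """
    excluded = set(excluded)
    free = [cell for cell in observed if cell not in excluded]
    start = sum(observed[c] for c in free) / len(free)
    fitted = {c: start for c in free}
    for _ in range(sweeps):
        for axes in margins:
            target, current = defaultdict(float), defaultdict(float)
            for c in free:
                key = tuple(c[a] for a in axes)
                target[key] += observed[c]
                current[key] += fitted[c]
            for c in free:
                key = tuple(c[a] for a in axes)
                if current[key] > 0:
                    fitted[c] *= target[key] / current[key]
    fitted.update({c: float(observed[c]) for c in excluded})
    return fitted


# West, Above H.S. slab; cells are (i, j, k, l) with k = race, l = fail/pass.
observed = {(3, 3, 1, 1): 0, (3, 3, 1, 2): 421, (3, 3, 2, 1): 1, (3, 3, 2, 2): 19}
margins = [(0, 1, 2), (0, 1, 3)]          # x(ijk.) and x(ij.l)

x_m = fit_margins(observed, margins)
x_r = fit_margins(observed, margins, excluded=[(3, 3, 1, 1), (3, 3, 1, 2)])
for cell in sorted(observed):
    print(cell, round(x_m[cell], 3), round(x_r[cell], 3))
```

On this slab the unadjusted fit reproduces the estimated values from Table 12.4, and the adjusted fit returns the observed counts in all four cells, as stated in the text.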


The global inference that race and boot camp completion are
conditionally independent is valid except for West, Above H.S., and with the
estimate $x_r^*$ the odds of failure for White are zero whereas they are 53
in 1,000 for Non-white.

Since $2I(x_r^*:x_m^*)/2I(x:x_m^*) = 6.235/14.435 = 0.43$, we conclude that
the outlier value West, Above H.S. accounts for 43% of the "unexplained
variation" $2I(x:x_m^*)$.

12.3. The Estimate $x_a^*(ijk\ell)$ Adjusted for Outliers. We shall first
derive a revised estimate for $x_a^*(ijk\ell)$ adjusted for the outlier cells
$x(3311)$, $x(3312)$, that is, we fit the marginals $x(ijk.)$, $x(i..\ell)$,
$x(.j.\ell)$ excluding the observations $x(3311)$, $x(3312)$ (West, Above H.S.,
White, Fail; West, Above H.S., White, Pass). Thus if we denote the new estimate
by $x_b^*(ijk\ell)$ we have $x_b^*(3311) = x(3311)$, $x_b^*(3312) = x(3312)$. The
values of the estimate $x_b^*$ are given in Table 12.12.

In particular, note that for West, Above H.S., the corresponding
observed and estimated cell entries are
West, Above H.S.

               $x(33k\ell)$              $x_b^*(33k\ell)$
          White    Non-white         White    Non-white
Fail          0            1             0        0.158
Pass        421           19           421       19.842
Total       421           20           421       20.000

The  associated  Analysis  of  Information  Table  12.13  follows. 


TABLE 12.13

ANALYSIS OF INFORMATION TABLE

Component Due to                                  Information                   D.F.

a) $x(ijk.)$, $x(i..\ell)$, $x(.j.\ell)$          $2I(x:x_a^*) = 21.819$         18

b) $x(ijk.)$, $x(i..\ell)$, $x(.j.\ell)$          $2I(x_b^*:x_a^*) = 5.868$       1
   less $x(3311)$, $x(3312)$                      $2I(x:x_b^*) = 15.951$         17


Note that the OUTLIER entry in the computer output for $x_a^*$, West, Above
H.S., White, Fail is 5.199, which is less than 5.868 as it should be.
Also, since $2I(x_b^*:x_a^*)/2I(x:x_a^*) = 5.868/21.819 = 0.27$, the outlier values
account for 27% of the "unexplained variation" $2I(x:x_a^*)$.

The computer output for the revised estimate $x_b^*$ yields for South,
H.S., Non-white, Fail the OUTLIER entry 3.69. Accordingly we now get a
new revised estimate $x_c^*(ijk\ell)$. The estimate $x_c^*(ijk\ell)$ is obtained by
fitting the marginals $x(ijk.)$, $x(i..\ell)$, $x(.j.\ell)$, as for $x_a^*$ and $x_b^*$,
except that the values $x(3311)$, $x(3312)$, and $x(4221)$, $x(4222)$ are
not included, that is, $x_c^*(3311) = x(3311)$, $x_c^*(3312) = x(3312)$,
$x_c^*(4221) = x(4221)$, $x_c^*(4222) = x(4222)$. The values of the estimate
$x_c^*(ijk\ell)$ are given in Table 12.14.
In particular, note that for West, Above H.S., and South, H.S.,
the corresponding observed and $x_c^*(ijk\ell)$ estimates are

            West, Above H.S.              South, H.S.
               $x(33k\ell)$                  $x(42k\ell)$
          White    Non-white         White    Non-white
Fail          0            1            34           16
Pass        421           19          1741          508
Total       421           20          1775          524

             $x_c^*(33k\ell)$              $x_c^*(42k\ell)$
          White    Non-white         White    Non-white
Fail          0        0.154        28.743           16
Pass        421       19.836      1746.257          508
Total       421       20.000      1775.000          524

The associated Analysis of Information Table 12.15 follows.


TABLE 12.15

ANALYSIS OF INFORMATION TABLE

Component Due to                                   Information                   D.F.

a) $x(ijk.)$, $x(i..\ell)$, $x(.j.\ell)$           $2I(x:x_a^*) = 21.819$         18

b) $x(ijk.)$, $x(i..\ell)$, $x(.j.\ell)$ less      $2I(x_b^*:x_a^*) = 5.868$       1
   $x(3311)$, $x(3312)$                            $2I(x:x_b^*) = 15.951$         17

c) $x(ijk.)$, $x(i..\ell)$, $x(.j.\ell)$ less      $2I(x_c^*:x_b^*) = 4.511$       1
   $x(3311)$, $x(3312)$, $x(4221)$,                $2I(x:x_c^*) = 11.440$         16
   $x(4222)$
Note that the measure of effect $2I(x_c^*:x_b^*) = 4.511$ is greater than the
OUTLIER entry for South, H.S., Non-white, Fail, 3.691, as it should be.
Also, since $2I(x_c^*:x_b^*)/2I(x:x_b^*) = 4.511/15.951 = 0.28$, the outlier
values $x(4221)$, $x(4222)$ account for 28% of the "unexplained variation"
$2I(x:x_b^*)$.
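The arithmetic behind these analysis-of-information tables is additive: each effect and its residual sum to the previous residual, and the degrees of freedom split the same way. A small Python sketch, with a hypothetical helper `partition`, reproduces the residuals and shares quoted in Tables 12.11 and 12.15:

```python
def partition(total, total_df, effect, effect_df):
    """Split 2I(x:x_old) into the effect 2I(x_new:x_old) and residual 2I(x:x_new)."""
    return {
        "residual": round(total - effect, 3),
        "residual_df": total_df - effect_df,
        "share": effect / total,     # fraction of "unexplained variation"
    }


# Table 12.11: model with interaction, adjusted for West, Above H.S.
table_12_11 = partition(14.435, 12, 6.235, 1)

# Table 12.15: additive model, adjusted in two successive steps.
step_b = partition(21.819, 18, 5.868, 1)
step_c = partition(step_b["residual"], step_b["residual_df"], 4.511, 1)

for name, result in (("12.11 r)", table_12_11), ("12.15 b)", step_b), ("12.15 c)", step_c)):
    print(name, result["residual"], result["residual_df"], f"{result['share']:.2f}")
```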


The log-odds for the estimates $x_b^*$ and $x_c^*$ are also given by the
parametric representation

$$
\ln \frac{x^*(ijk1)}{x^*(ijk2)} = \tau_1^{\ell} + \tau_{i1}^{i\ell} + \tau_{j1}^{j\ell}
\tag{12.3}
$$

similar to that for $x_a^*$. The values of the parameters corresponding to
$x_b^*$ and $x_c^*$ are given in Table 12.7 and the ratio of odds and odds of
failure in Tables 12.8, 12.9 and 12.10.


We  note  that  the  results  for  home  of  record  West  and  North  are 
better  than  those  for  home  of  record  South  and  East,  even  accounting  for 
the  outlier  values. 


13.  Zero  Marginals 

As  may  be  noted  from  the  analysis  in  Section  12,  zero  occurrences 
in  cells  of  the  observed  contingency  table  present  no  special  problem 
provided  that  no  marginal  entering  into  the  fitting  specification  is  zero. 
When  the  latter  is  the  case,  however,  the  interpretation  may  be  distorted 
because of inflated degrees of freedom. A procedure to circumvent this
problem is similar to that used for getting revised estimates when
outliers are indicated. We shall present the procedure in terms of a
specific example.

A four-way cross-classification of 16,723 marines based on the
variables home of record, level of education, AFQT, and boot camp
completion is given in Table 13.1. We denote the occurrences in the four-way
observed cross-classification or contingency table by $x(ijk\ell)$ with the
notation
Variable                Index    1             2        3             4        5

Home of Record          i        East          North    West          South
Level of Education      j        Below H.S.    H.S.     Above H.S.
AFQT                    k        I             II       III           IV A     IV B
Boot Camp Completion    $\ell$   Fail          Pass

As in the analysis in Section 12, we are interested in the possible
relationship of the variable fail or pass, as a dependent variable, on the
independent or explanatory variables home of record, level of education
and AFQT.

We  summarize  the  results  of  fitting  a  sequence  of  nested  marginals 
in  the  truncated  Analysis  of  Information  Table  13.2. 


TABLE 13.2

ANALYSIS OF INFORMATION TABLE

Component Due to                                             Information                     D.F.

a) $x(ijk.)$, $x(...\ell)$                                   $2I(x:x_a^*) = 182.828$          59

e) $x(ijk.)$, $x(i..\ell)$, $x(..k\ell)$, $x(.j.\ell)$       $2I(x_e^*:x_a^*) = 119.182$       9
                                                             $2I(x:x_e^*) = 63.646$           50

n) $x(ijk.)$, $x(..k\ell)$, $x(ij.\ell)$                     $2I(x_n^*:x_e^*) = 16.268$        6
                                                             $2I(x:x_n^*) = 47.378$           44

We note that $2I(x:x_a^*) = 182.828$, 59 degrees of freedom, with
$x_a^*(ijk\ell) = x(ijk.)\,x(...\ell)/n$, rejects the null hypothesis that boot camp
completion is independent of the joint variable (home of record, level
of education, and AFQT).

The value of $2I(x:x_n^*) = 47.378$, 44 degrees of freedom, implies
that $x_n^*$ is a good estimate, and the value $2I(x_n^*:x_e^*) = 16.268$, 6 degrees
of freedom, implies that the marginal $x(ij.\ell)$ and its associated
interaction parameter for boot camp completion with home of record and level of
education are significant. The values of $x_n^*(ijk\ell)$ are given in Table 13.3.
The log-odds of fail to pass are given by the parametric representation

$$
\ln \frac{x_n^*(ijk1)}{x_n^*(ijk2)} = \tau_1^{\ell} + \tau_{i1}^{i\ell} + \tau_{j1}^{j\ell} + \tau_{k1}^{k\ell} + \tau_{ij1}^{ij\ell} .
\tag{13.1}
$$


We note that in Table 13.1 no failures were recorded for recruits
with home of record West and level of education H.S. for all AFQT's, that
is, the observations $x(32k1)$ for $k = 1, 2, 3, 4, 5$ are all zero. As a
consequence, the marginal $x(32.1) = 0$, and the estimates $x_n^*(32k\ell)$
based on fitting the marginals $x(ijk.)$, $x(..k\ell)$, $x(ij.\ell)$ are equal
to $x(32k\ell)$. This distorts the interpretation on the basis of degrees
of freedom and significant interaction parameters.


We  shall  therefore  follow  a  procedure  somewhat  similar  to  that  for 
OUTLIERS  adjusting  for  the  zero  marginal  values.  The  adjusted  procedure 
is  to  delete  the  observations  x(32k£)  from  the  estimation  procedure. 

The revised estimates are derived by fitting the marginals $x(ijk.)$,
$x(..k\ell)$, $x(ij.\ell)$ excluding the cells with home of record West and level
of education H.S., that is, the cells $(32k\ell)$, and using the observed
values $x(32k\ell)$ as the estimates for those cells. The revised procedure
yields the Analysis of Information Table 13.4.


TABLE 13.4

ANALYSIS OF INFORMATION TABLE

Component Due to                                         Information                   D.F.

r) $x(ijk.)$, $x(i..\ell)$, $x(.j.\ell)$,                $2I(x:x_r^*) = 51.534$         45
   $x(..k\ell)$, excluding $x(32k\ell)$

s) $x(ijk.)$, $x(..k\ell)$, $x(ij.\ell)$,                $2I(x_s^*:x_r^*) = 4.153$       5
   excluding $x(32k\ell)$                                $2I(x:x_s^*) = 47.381$         40

Note that $2I(x:x_r^*)$ has 45 degrees of freedom compared to 50 for $2I(x:x_e^*)$
and $2I(x:x_s^*)$ has 40 degrees of freedom compared to 44 for $2I(x:x_n^*)$.
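One way to see where these degrees of freedom come from is to count one free log-odds per (i, j, k) combination and subtract the number of fitted tau parameters. This counting is offered as an interpretation that reproduces the tabulated values, not as a statement taken from the report; the names in the Python sketch are hypothetical.

```python
def degrees_of_freedom(n_cells, n_parameters):
    """Free log-odds (one per i, j, k combination) minus fitted tau parameters."""
    return n_cells - n_parameters


I, J, K = 4, 3, 5                          # homes, education levels, AFQT groups
cells = I * J * K                          # 60 combinations of (i, j, k)
main = 1 + (I - 1) + (J - 1) + (K - 1)     # overall average plus main effects
inter = (I - 1) * (J - 1)                  # home x education interactions

df_e = degrees_of_freedom(cells, main)
df_n = degrees_of_freedom(cells, main + inter)

# Excluding West, H.S. removes K combinations; under s) the interaction
# parameter for that home-by-education pair can no longer be estimated.
df_r = degrees_of_freedom(cells - K, main)
df_s = degrees_of_freedom(cells - K, main + inter - 1)

print(df_e, df_n, df_r, df_s)
```

Under this counting the effect of adding the interaction marginal after the exclusion has 45 - 40 = 5 degrees of freedom rather than 6, which matches Table 13.4.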
We now see that $2I(x_s^*:x_r^*) = 4.153$, 5 degrees of freedom, implies
that the effect of adding $x(ij.\ell)$ to the set of fitted marginals, or
equivalently the associated interaction parameters for home of record by level
of education by failure, is not significant, and $2I(x:x_r^*) = 51.534$, 45
degrees of freedom, implies that $x_r^*$ is an acceptable fit. The values of
$x_r^*(ijk\ell)$ are given in Table 13.5.


The parametric representation of the log-odds of failure in boot
camp using the estimate $x_r^*(ijk\ell)$ is given by

$$
\ln \frac{x_r^*(ijk1)}{x_r^*(ijk2)} = \tau_1^{\ell} + \tau_{i1}^{i\ell} + \tau_{j1}^{j\ell} + \tau_{k1}^{k\ell} .
\tag{13.2}
$$


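Thus the log-odds depend only on an overall average effect $\tau_1^{\ell}$ and
additive effects due to home of record $\tau_{i1}^{i\ell}$, level of education
$\tau_{j1}^{j\ell}$ and AFQT $\tau_{k1}^{k\ell}$, with no interaction effects.
The values of the parameters in the representation of the log-odds are

$\tau_1^{\ell}$       = -4.376837      $\tau_{21}^{j\ell}$ =  0.481840
$\tau_{11}^{i\ell}$   =  0.145880      $\tau_{11}^{k\ell}$ = -0.665526
$\tau_{21}^{i\ell}$   = -1.148652      $\tau_{21}^{k\ell}$ = -0.712272
$\tau_{31}^{i\ell}$   = -0.759926      $\tau_{31}^{k\ell}$ = -0.639670
$\tau_{11}^{j\ell}$   =  1.029758      $\tau_{41}^{k\ell}$ = -0.289594 .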


For convenience we tabulate the odds of failure (to 1,000) in Tables
13.6 and 13.7. Note that the overall odds of failure for these data
(excluding West, H.S.) are 183/14888 = 0.0123, or 12 to 1,000.


Within a given home of record and for the same level of education
the results for AFQT I, II, and III are apparently the same, with
increasing odds of failure respectively for AFQT IV A and IV B.

The results for home of record North and West are consistently
better than those for home of record South and East.
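These two observations can be read off the parameters of (13.2) by exponentiating them. The Python sketch below uses the parameter values listed above; where the printed labels are hard to read, the assignment of each value to its parameter follows the order of the terms in (13.2) and should be treated as an assumption. The names are hypothetical.

```python
import math

# Parameter values for x*_r in (13.2); South, Above H.S. and AFQT IV B are
# zero by the convention that the last index level is the reference.
TAU_0 = -4.376837
HOME = {"East": 0.145880, "North": -1.148652, "West": -0.759926, "South": 0.0}
EDUC = {"Below H.S.": 1.029758, "H.S.": 0.481840, "Above H.S.": 0.0}
AFQT = {"I": -0.665526, "II": -0.712272, "III": -0.639670, "IV A": -0.289594, "IV B": 0.0}


def odds_per_thousand(home, educ, afqt):
    """Odds of failure to 1,000 under the additive model (13.2).

    Not applicable to West, H.S., whose cells are excluded from the fit.
    """
    return 1000.0 * math.exp(TAU_0 + HOME[home] + EDUC[educ] + AFQT[afqt])


# Multiplicative effect of each category on the odds, relative to its reference.
afqt_factor = {k: math.exp(v) for k, v in AFQT.items()}
home_factor = {h: math.exp(v) for h, v in HOME.items()}

for name, factor in afqt_factor.items():
    print(f"AFQT {name:4s} multiplies the odds by {factor:.2f}")
for name, factor in home_factor.items():
    print(f"{name:5s} multiplies the odds by {factor:.2f}")
```

Because the model is additive on the log scale, these factors apply unchanged within every home of record and level of education.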
9.803  12.473  14.242  2.846  3.508  10.489  7.815  15.213  1.489  0.882  1.284  0.

528.197  473.527  404.757  272.154  351.492  977.511  513.185  747.788  230.511  143.118  193.716  57. 